similar machine learning project on topcoder - Deus ex machina
There's a data mining challenge from NASA that runs in parallel on topcoder.
It involves the same Catalina Sky Survey data.
The source images are apparently 4110 x 4096 and the images we get are 256 x 256, so each one covers only about 0.4% of an original frame by area (roughly 6% of its width).
Sample input data and detailed descriptions thereof will be provided in
the forums. We have 180GB of data available. The input to the
algorithm will be:
- 4 raw images of the sky, captured roughly 10 minutes apart. The
  resolution of the images is 4110 x 4096 and they contain 16 bit
  values.
- Detection file associated with the 4 images. The file contains
  a list of image coordinates and additional information for the
  detected objects.
- Known object file that contains a listing of known objects
  near the standard field center, at the exact times of the 4
  observations.
- Rejection file that contains the false positive detections.
We want to reproduce these rejections from only images,
detection and known object files. Detections may be rejected because
of common reasons such as faintness, bad image data, star bleed and
detector artifacts. A document describing these common reasons will be
posted in the forums.
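For a sense of scale, here is a rough back-of-the-envelope sketch of how big one set of four raw images would be. It assumes the images are stored uncompressed at exactly 2 bytes per pixel and that "180GB" means 180 x 10^9 bytes, neither of which the description confirms; the function names are just made up for illustration.

```python
# Rough size estimate for the raw inputs described above.
# Assumptions (not confirmed by the challenge text): uncompressed storage,
# exactly 2 bytes per 16-bit pixel, and 1 GB = 10**9 bytes.

WIDTH, HEIGHT = 4110, 4096   # resolution quoted in the description
BYTES_PER_PIXEL = 2          # 16 bit values
IMAGES_PER_SET = 4           # 4 raw images per observation set


def image_bytes(width=WIDTH, height=HEIGHT, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size in bytes of one uncompressed raw image."""
    return width * height * bytes_per_pixel


def set_bytes(images=IMAGES_PER_SET):
    """Size in bytes of one set of raw images (detection files ignored)."""
    return images * image_bytes()


def approx_sets(total_bytes):
    """Upper bound on how many full image sets fit in total_bytes."""
    return total_bytes // set_bytes()


if __name__ == "__main__":
    print("one image :", image_bytes() / 1e6, "MB")
    print("one set   :", set_bytes() / 1e6, "MB")
    print("sets in 180GB (at most):", approx_sets(180 * 10**9))
```

The real number of sets is probably different, since the 180GB presumably also includes the detection, known object and rejection files, and the images may be compressed.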
by Dr.Asteroid scientist, admin
Wow, this got sent to the bottom of my listing. Yes! We are also responsible for this competition. All the data that is in the topcoder challenge will be presented, but it has to be cut into much smaller chunks. There is no good way to send 4k x 4k images for humans to look at. So, instead of one giant image, you'll get around 260 smaller images (note that we have to have overlap between images to make sure that asteroids don't get missed between frames). There's some variability in the total number due to the variable overlap between the four images.
Edit: thank you for noticing!
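To put numbers on the chunking, here is a small sketch that works out how much of a 4110 x 4096 frame one 256 x 256 image covers, and how many tiles a plain regular grid would need for a given overlap. The grid layout, the overlap values and the function names are assumptions for illustration only. The thread does not describe the actual cutting scheme, so this will not necessarily reproduce the figure of around 260.

```python
import math

# Tile arithmetic for cutting a large frame into small overlapping images.
# The regular-grid layout and the overlap parameter are assumptions; the
# actual cutting scheme used by the project is not described in the thread.

FRAME_W, FRAME_H = 4110, 4096   # full frame size quoted in the thread
TILE = 256                      # side of the small images


def area_fraction(tile=TILE, width=FRAME_W, height=FRAME_H):
    """Fraction of the full frame area covered by a single tile."""
    return (tile * tile) / (width * height)


def tiles_along(length, tile, overlap):
    """Tiles needed to cover `length` pixels when neighbours share `overlap` pixels."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    if length <= tile:
        return 1
    stride = tile - overlap
    return math.ceil((length - tile) / stride) + 1


def tile_count(overlap, tile=TILE, width=FRAME_W, height=FRAME_H):
    """Total tiles in a regular grid that covers the whole frame."""
    return tiles_along(width, tile, overlap) * tiles_along(height, tile, overlap)


if __name__ == "__main__":
    print("one tile covers {:.2%} of the frame".format(area_fraction()))
    for overlap in (0, 16):  # hypothetical overlap values, in pixels
        print("overlap", overlap, "->", tile_count(overlap), "tiles")
```

By pure area the frame is worth a little under 257 tiles, so a count of around 260 is in the right range, although a strict grid needs somewhat more because the 4110 pixel side is not a multiple of 256.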