General Framework for Data Collections

To download the data collection checklist that accompanies this post, click here!


In the last few months, we have received many requests for information on best practices in data collection and analysis. Markerless motion capture is a new collection modality, and accordingly, best practices differ somewhat from those for optical motion capture. In this blog, we will go from start to finish, describing how we set up our experiment and how we analyze the data, in an effort to streamline collections for you in the future.


Before reading this blog, please review the following posts for some important background information:


General Setup and Calibration


Before a participant arrives, we run what we call a ‘spot check.’ The goal is to make sure that the system is properly calibrated and functioning as expected. We recommend collecting a calibration trial and a short movement trial with the experimenter. Once both trials are collected, we analyze the calibration, check the residual values, and verify that the origin of the reference frame is located where we expect it to be; it should appear at the same position in every view. The residual values should be ~1 mm, and when you scroll through the calibration trial, the green spheres should sit directly on each checkerboard corner in every view.
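Theia3D reports these residuals in its interface; if you export them for record-keeping, a tiny script can make the pass/fail decision explicit. This is only a sketch: the camera names, the exported dictionary format, and the 1.5 mm tolerance are assumptions for illustration, not part of Theia3D itself.

```python
# Hypothetical spot-check helper. Theia3D shows per-camera calibration
# residuals in its GUI; here we assume they have been exported into a
# simple {camera: residual_mm} mapping.

RESIDUAL_LIMIT_MM = 1.5  # assumed tolerance; residuals should be ~1 mm


def check_residuals(residuals_mm):
    """Return (camera, residual) pairs that exceed the limit."""
    return [(cam, r) for cam, r in residuals_mm.items() if r > RESIDUAL_LIMIT_MM]


# Made-up example values: camera 3 would fail the spot check.
residuals = {"cam1": 0.9, "cam2": 1.1, "cam3": 2.7}
print(check_residuals(residuals))  # -> [('cam3', 2.7)]
```

If anything is flagged, re-collect the calibration before the participant arrives rather than troubleshooting on their time.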


Sometimes the final reference frame location may appear to be off in space, located at the last detected checkerboard; if the residuals are low and the green spheres track the board well, the calibration is still valid. The reference frame can be repositioned using the “Adjust Reference Frame” functionality, which allows it to be moved to the desired location. Here you can see a view of an annotated checkerboard (left) and the object calibration pane (right), demonstrating how the adjust-origin functionality can be used:



Here is an example of the reference frame in the same location in two views:



Here is an example of the reference frame in two different locations in two views. This requires the calibration to be performed again:



With the calibration complete, we run the analysis on the movement trial. Here we are just checking the basics: is the skeleton on the person in each view, and are the views synchronized? If not, either you loaded the wrong calibration or the calibration was not successful, and there is no point collecting anything further until this is resolved.


This is how the skeleton appears if you run the analysis using an invalid calibration. Though the skeleton is close, it is not on the person and the results will be compromised:



These checks may seem trivial, but troubleshooting the calibration can be challenging when the participant has arrived, so we include it in our standard operating procedures for any collection.


The Collection


Once the participant has arrived, we tend to follow a checklist of the movements we want to collect. Because the collection process can be as short as a few minutes, being prepared with a detailed checklist and verbal instructions will help this process go smoothly.


Once the movements are collected and before the participant leaves, we like to analyze the last movement trial against the calibration that we previously verified. Here we are checking that the skeleton is located on the person in each view and that it follows them throughout the trial. If one of the skeleton overlays is not on the subject, it indicates that a camera was moved during the recordings, which requires an additional calibration. If this occurs, we collect another calibration trial; we tend to analyze this second calibration after the fact to avoid inconveniencing the participant, but that is up to you. If this step is successful, it’s a good bet that the calibration file will be valid for all the trials that were recorded.


Because it’s possible to record quite a few trials on any given day, we format the data from that collection and add the calibration to those trials immediately after the collection is finished. If this is not possible because the data is still stored on the cameras, we make sure to do it at the end of the day. The result of this step is organized data that is ready for batch processing.


At this step we have:

  1. Verified the initial calibration in the space

  2. Checked the last trial against this calibration

  3. Organized the data and added the calibration to the raw files


At this point, it becomes clear if the naming convention of your files is difficult to manage for the particular experiment. We organize our data as Subject->Action->Trial: each subject has a folder of actions they performed, and each action has a list of trials. This allows us to apply different preferences for each task if required. These are not hard requirements; however, following them will make your life easier after the fact. We run these organizational tasks using the “Organize Files” and “Assign Calibration” tools within the TMBatch companion application. If you find yourself manually renaming files and copying and pasting calibration files, that is often an indication that an upstream naming convention was inadequate.
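The Subject->Action->Trial convention can also be checked mechanically. Here is a minimal sketch, assuming a list of file paths relative to the data root; any file that is not exactly three levels deep (Subject/Action/trial file) gets flagged. The file and folder names are hypothetical examples, and TMBatch's "Organize Files" tool is what does this job for real.

```python
from pathlib import Path


def misplaced(paths):
    """Return relative paths that don't match Subject/Action/Trial (3 parts)."""
    return [p for p in paths if len(Path(p).parts) != 3]


files = [
    "S01/walking/trial01.avi",  # fits the convention
    "S01/walking/trial02.avi",
    "S01/stray_recording.avi",  # only 2 levels deep: flagged
]
print(misplaced(files))  # -> ['S01/stray_recording.avi']
```

Running a check like this before batching catches stray files while they are still easy to re-home.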


After the Collection


Assuming everything above went as planned, we then batch all the data for one subject (or many), which generates our pose files. When setting up the batch, we verify the preferences within TMBatch and make sure that the analysis options we have selected are consistent with the experimental protocol. Using the copy-and-paste feature in TMBatch allows this to go quite quickly. If you want different preferences depending on the action, you can use the Search option on specific actions in TMBatch and apply the copy-and-paste action to those files only. TMBatch will also flag errors at this stage (e.g., if you have multiple calibration files within your structure) and will show general metadata in the directory as well as the cutoff frequency for the analysis. After the batch is complete, we review the status of each trial to ensure that it was analyzed correctly. There may be errors, shown in red, that need to be examined more closely in the Theia3D GUI; here no errors have been flagged, so we're good to go!



I didn’t mention it explicitly above, but at this point it’s really important to have your analysis scripts set up, which often requires a few subjects to act as pilots. We like to automate this analysis in Visual3D and Inspect3D (C-Motion products); doing so allows us to refine our experimental protocol and produce output results as we go. There is nothing worse than realizing at the end of the experiment that you didn’t collect what you needed for the question you are trying to answer.


This is really important because the collection step happens quickly. We previously ran an experiment with 30+ subjects, and the collection was done in one day. Because we hadn’t set up our analysis pipelines, only after the fact did we realize that we hadn’t recorded the trials we needed to answer our question, and we had to re-collect the data. Please learn from our mistakes and have these scripts ready before the bulk of the data is collected. This step also allows us to verify the preferences we are using for the collection.


In Visual3D, we automate the pipelines to run at the top level (where TMBatch saves out the pose files), and since we run them as we go, a pipeline will often run the analysis on multiple subjects. This is possible within the Visual3D framework and just requires some folder wildcards to be set up correctly according to the protocol and the files being collected.
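The folder-wildcard idea can be sketched outside Visual3D as well: with the Subject/Action layout, a single glob pattern gathers every pose file under the top-level batch output folder. The `.c3d` extension and the exact layout are assumptions about the export here, not a statement about Visual3D's own wildcard syntax.

```python
from pathlib import Path


def collect_pose_files(top_level):
    """Gather Subject/Action/*.c3d pose files under the batch output folder.

    Assumes the Subject -> Action -> Trial layout described earlier, with
    pose files exported as .c3d (an assumption for this sketch).
    """
    return sorted(Path(top_level).glob("*/*/*.c3d"))
```

A pattern like this is also a quick sanity check: if the number of files returned doesn't match the number of trials you collected, something went wrong upstream.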



Once all of the Visual3D analysis is finished (I haven’t written much about this step, but it will take some time to get right, so be patient), we often review data in bulk using Inspect3D. It is a really powerful tool for reviewing your entire data set: it lets you clean up bad events in the files (as an aside, I don’t think I have collected a single experiment that had only clean events) and visualize the results you are interested in across subjects. Like Visual3D, this can be set up beforehand using Queries, which effectively define the signals of interest during particular event sequences. We recommend setting this up ahead of time as well; however, it isn’t as critical as the Visual3D pipeline, since Queries can be built out quickly after the fact.



In general, this is not rocket science. I often preach that the organization of the protocol defines how easily actionable information can be obtained, and that actually reviewing the data as you go is what ensures everything recorded is useful.


To summarize, we recommend that you:

  1. Have a clear checklist of the trials

  2. Run a calibration and spot check before the participant arrives

  3. Analyze one trial before the participant leaves

  4. Make sure to pilot the protocol beforehand

  5. Format data and add calibration using the tools in TMBatch

  6. Assign preferences in TMBatch based on the protocol

  7. Have scripts ready ahead of time for analysis (Visual3D)

  8. Bulk review data to ensure consistency of events (Inspect3D)


We hope this information is helpful and leads to many great data collections that avoid frustrating, easily-avoidable mistakes.


To download the data collection checklist that accompanies this post or reach out for a demo, click here.
