Formatting Data for Use With ebFRET Hidden Markov Modeling Software

Clark Fritsch

Nov 09, 2022


This protocol is a draft, published without a DOI.

Clark Fritsch¹

¹University of Pennsylvania

Clark Fritsch

Johns Hopkins University

Protocol Citation: Clark Fritsch 2022. Formatting Data for Use With ebFRET Hidden Markov Modeling Software. protocols.io https://protocols.io/view/formatting-data-for-use-with-ebfret-hidden-markov-civaue2e

License: This is an open access protocol distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Protocol status: Working

We use this protocol and it's working

Created: November 05, 2022

Last Modified: November 09, 2022

Protocol Integer ID: 72322

Abstract

This protocol follows from the "Selecting a Region of Interest for Hidden Markov Modeling using ebFRET" protocol and is the second step towards analyzing your data using Hidden Markov Modeling. It describes the steps needed to take the traces containing your region of interest (selected in the previous protocol) and format them for use with ebFRET, a software package that performs Hidden Markov Modeling on single-molecule FRET datasets.

This protocol is a continuation of the "Selecting a Region of Interest for Hidden Markov Modeling using ebFRET" practice protocol.

To begin this protocol, you should have your "SavedData.csv" file and traces in one directory, as shown below:

In a real analysis, your "SavedData.csv" file will likely not include every trace listed in your "GoodOnes.txt" file. For example, you might think that a trace is good at first and include it in your "GoodOnes.txt" file, but find upon closer inspection that the trace is not worthy of further analysis and therefore exclude it from your "SavedData.csv" file.

Because of this, you want to separate the traces that are included in your "SavedData.csv" file from the remainder of your traces.
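The relationship between the two lists can be illustrated with a short R sketch. The trace names below are made up, and the way names are stored in "GoodOnes.txt" and "SavedData.csv" is an assumption for illustration only; the sketch simply shows which traces would be left behind.

```r
# Made-up trace names; the real files may be laid out differently.
good_ones <- c("trace1", "trace2", "trace3", "trace4")  # as in "GoodOnes.txt"
saved     <- c("trace1", "trace3")                      # as in "SavedData.csv"

# Traces marked good at first but excluded on closer inspection.
excluded <- setdiff(good_ones, saved)
print(excluded)
```

Only the traces in `saved` should move on to the next steps of this protocol.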

To separate the traces included in your "SavedData.csv" file from the remainder of your traces, you can use the "SavedData_Copy_and_Transfer_v3.R" program that is attached below:

SavedData_Copy_and_Transfer_v3.R  

To use this program in R, simply copy the path to the directory that contains all of your traces and the "SavedData.csv" file and paste the path into the "path_var_old" variable in between the parentheses, as highlighted below:

Then press "Ctrl A" to highlight all of the code and press "Ctrl Enter" to run the code.

After running the code, you will see the following information printed to the console (boxed in yellow):

This output shows, for each trace, whether it was transferred successfully to the output directory: TRUE if the transfer succeeded and FALSE if it did not. Your "SavedData.csv" file is also transferred to the output directory.
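The attached script is not reproduced here, but a TRUE or FALSE value per file is what base R's `file.copy()` returns. The following is a minimal sketch of that kind of transfer, not the attached program itself. It assumes that "SavedData.csv" lists trace filenames in a column (called `trace_file` here, a hypothetical name) and uses made-up files in a temporary directory in place of real data.

```r
# Hypothetical stand-in for the trace directory; replace with your own path.
path_var_old <- file.path(tempdir(), "traces_demo")
dir.create(path_var_old, showWarnings = FALSE)

# Made-up demo files: three traces, of which only two are in SavedData.csv.
for (f in c("trace1.txt", "trace2.txt", "trace3.txt")) {
  writeLines("demo", file.path(path_var_old, f))
}
write.csv(data.frame(trace_file = c("trace1.txt", "trace3.txt")),
          file.path(path_var_old, "SavedData.csv"), row.names = FALSE)

# Read the list of selected traces (the column name is an assumption).
saved <- read.csv(file.path(path_var_old, "SavedData.csv"),
                  stringsAsFactors = FALSE)

# Create the output directory and copy the selected traces into it.
out_dir <- file.path(path_var_old, "SavedData")
dir.create(out_dir, showWarnings = FALSE)
copied <- file.copy(file.path(path_var_old, saved$trace_file), out_dir,
                    overwrite = TRUE)
names(copied) <- saved$trace_file
print(copied)  # one logical value per file: TRUE if the copy succeeded

# Copy SavedData.csv itself as well.
csv_copied <- file.copy(file.path(path_var_old, "SavedData.csv"), out_dir,
                        overwrite = TRUE)
```

For real data, the attached "SavedData_Copy_and_Transfer_v3.R" program remains the tool to run for this step.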

After the program has run successfully, you will see that a "SavedData" output directory has been created inside the directory whose path you entered into the program:

This "SavedData" directory will contain only your "SavedData.csv" file and the traces that were contained within your "SavedData.csv" file, as shown below:

Next, you want to cut out the parts of each trace that fall outside the range set by the Click1 and Click2 values recorded in your "SavedData.csv" file.

You can do this using the "Click1_Click2_Cut.R" program that is included below:

Click1_Click2_Cut.R

The "Click1_Click2_Cut.R" program, as described above, simply takes the values for Click1 (leftmost boundary of your ROI) and Click2 (rightmost boundary of your ROI) and removes the frames from each trace that are not included between Click1 and Click2.
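The cut itself can be pictured with a small R sketch. It is an illustration under stated assumptions, not the attached code: the trace is taken to be a table with one row per frame, the column names `frame`, `donor`, and `acceptor` are hypothetical, Click1 and Click2 are assumed to be frame numbers, and the boundaries are treated as inclusive.

```r
# Made-up trace: one row per frame with donor and acceptor intensities.
trace <- data.frame(
  frame    = 1:10,
  donor    = c(5, 6, 5, 7, 8, 2, 2, 3, 2, 1),
  acceptor = c(1, 1, 2, 1, 1, 7, 8, 7, 8, 9)
)

# Hypothetical ROI boundaries, as they might be read from SavedData.csv.
click1 <- 3  # leftmost boundary of the ROI
click2 <- 7  # rightmost boundary of the ROI

# Keep only the frames between the two boundaries (inclusive, an assumption).
cut_trace <- function(trace, click1, click2) {
  lo <- min(click1, click2)
  hi <- max(click1, click2)
  trace[trace$frame >= lo & trace$frame <= hi, , drop = FALSE]
}

trimmed <- cut_trace(trace, click1, click2)
print(trimmed)
```

The attached "Click1_Click2_Cut.R" program performs this kind of cut for every trace, using the Click1 and Click2 values stored in "SavedData.csv".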

To use the "Click1_Click2_Cut.R" program in R, copy the path to the "SavedData" directory and paste it into the "path_var" variable in between the parentheses, as highlighted below:

Then press "Ctrl A" to highlight all of the code and press "Ctrl Enter" to run the code.

Once you have run the "Click1_Click2_Cut.R" program, a new directory called "T1_to_T2_Synchronized" will be created. It contains your new traces, each of which has been cut to contain only the frames between your Click1 and Click2 values:

Each of these new "trimmed" traces can be distinguished from the original traces because "-T1_to_T2_Synchronized" is appended to the end of its filename:

Now that you have your trimmed traces prepared, you need to format the traces so that they can be run using the ebFRET Hidden Markov Modeling program.

To do this, you can use the "RegionOfInterest_to_ebFRET.R" program that is attached below:

RegionOfInterest_to_ebFRET.R  

To use ebFRET, your traces of interest need to be stitched together into a single file, with each trace associated with a unique ID number. Additionally, the donor and acceptor fluorophore intensities for each timepoint in each trace need to be included in the file.

To use the "RegionOfInterest_to_ebFRET.R" program to format your traces in this way, simply copy the path that contains your trimmed traces into the "path" variable in the program, as highlighted below:

Additionally, you need to name the formatted file that your trace information will be written to by entering the new filename into the "new_file_name" variable, as shown below:

Note that the new file name must have the ".dat" extension, as this is the only text file format that ebFRET will recognize.

Press "Ctrl A" to highlight all of the code and press "Ctrl Enter" to run the code.

Once you have run the program, your ebFRET-formatted file will appear in your "T1_to_T2_Synchronized" directory. It is now ready to be loaded into ebFRET for Hidden Markov Analysis.
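For orientation, the stitched layout described earlier (one unique ID per trace, plus donor and acceptor intensities for every timepoint) can be sketched in R as follows. This is an illustration, not the attached "RegionOfInterest_to_ebFRET.R" program. The three-column order (ID, donor, acceptor), the absence of a header row, the column names, and the example filename are all assumptions; check the ebFRET documentation for the exact layout it expects.

```r
# Made-up trimmed traces; in practice these would be read from the
# "T1_to_T2_Synchronized" directory. Column names are hypothetical.
traces <- list(
  data.frame(donor = c(5, 7, 8), acceptor = c(2, 1, 1)),
  data.frame(donor = c(2, 3),    acceptor = c(7, 8))
)

# Stack the traces, tagging every row with the ID of the trace it came from.
stitched <- do.call(rbind, lapply(seq_along(traces), function(i) {
  data.frame(id = i,
             donor = traces[[i]]$donor,
             acceptor = traces[[i]]$acceptor)
}))

# The output name must end in ".dat" (hypothetical example name).
new_file_name <- "example_ebFRET.dat"
stopifnot(grepl("\\.dat$", new_file_name))

# Write whitespace-separated numbers without a header or row names.
out_path <- file.path(tempdir(), new_file_name)
write.table(stitched, out_path, row.names = FALSE, col.names = FALSE)
```

Each row of the resulting file corresponds to one timepoint, and rows belonging to the same trace share the same ID in the first column.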
