How to convert a GPR file to CSV format using Python

baber

Hi
I am a beginner. I would like to know how to convert a file in gpr format to csv format using Python.
Baber

Jan 11 '07 #1


bartonc

6,596

Expert 4TB


Very well. Let's move this to the Python forum. Welcome to TSDN.

Jan 11 '07 #2

bartonc

6,596

Expert 4TB


Welcome to the Python Forum on TheScripts.com.
I don't recognize gpr. Is it some other text format or from a program?

Jan 11 '07 #3

ghostdog74

511

Expert 256MB

Well, you should help us to help you by providing an example of the gpr format and your expected output, which in this case is csv.
Looking up the gpr extension, I can only find that it relates to some modeling software system...

Jan 12 '07 #4

bartonc

6,596

Expert 4TB


Hey ghostdog! Where you been so long?
I actually found the GenePix Results format, but don't know if this is the correct one:


 GPR Header

A sample GPR file header and a description of each entry are shown below: 
 
Entry Description 
 
ATF     1.0 File type and version number. 

29       48 Number of optional header records and number of data fields (columns). 

"Type=GenePix Results 3" Type of ATF file. 

"DateTime=2002/02/09 17:15:48" Date and time when the image was acquired. 

"Settings=C:\Genepix\Genepix.gps" The name of the settings file that was used for analysis. 

"GalFile=C:\Genepix\Demo.gal" The GenePix Array List file used to associate Names and IDs to each entry. 

"PixelSize=10" Resolution of each pixel in µm. 

"Wavelengths=635     532" Installed laser excitation sources in nm. 

"ImageFiles=C:\Genepix\demo.tif 0 C:\Genepix\Genepix.tif 1" The name and path of the associated TIF file(s). 

"NormalizationMethod=None" The type of normalization method used, if applicable. 

"NormalizationFactors=1    1" The normalization factor applied to each channel. 

"JpegImage=C:\Genepix\demo.jpg" The name and path of the associated Jpeg image files. 

"StdDev=Type 1" The type of standard deviation calculation selected in the Options settings. 

"RatioFormulation=W1/W2 (635/532)" The ratio formulation of the ratio image, showing which image is numerator and which is denominator. 

"Barcode=00331" The barcode symbols read from the image. 

"BackgroundSubtraction=LocalFeature" The background subtraction method selected in the Options settings. 

"ImageOrigin=0, 0" The origin of the image relative to the scan area. 

"JpegOrigin=390, 4320" The origin of the Results JPEG image (the bounding box of the analysis Blocks) relative to the scan area origin. 

"Creator=GenePix 4.1.1.4" The version of the GenePix Pro software used to create the Results file. 

"Scanner=GenePix 4000B [serial number]" Type and serial number of scanner used to acquire the image. 

"FocusPosition=0" The focus position setting used to acquire the image, in microns. 

"Temperature=19.6127" The temperature of the scanner, in degrees C. 

"LinesAveraged=1" The line average setting used to acquire the image. 

"Comment=hyb 2673" User-entered file comment. 

"PMTGain=500     600" The PMT settings during acquisition. 

"ScanPower=100    100" The amount of laser transmission during acquisition. 

"LaserPower=1    1" The power of each laser, in volts. 

"LaserOnTime=5    5" The laser on-time for each laser, in minutes. 

"Filters=<Empty>    <Empty>" Emission filters used during acquisition (GenePix 4100 and 4200 only.) 

"ScanRegion=100,100,2000,2000" The coordinate values of the scan region used during acquisition, in pixels. 

"Supplier=" Header field supplied in GAL file. 

Data record column headings Column titles for each measurement (see below). 

Data Records Extracted data. 
 
GPR Data

The list below describes each column of data in the Results file. 
 
Column Title Description 
 
Block the block number of the feature. 

Column the column number of the feature. 

Row the row number of the feature. 

Name the name of the feature derived from the Array List (up to 40 characters long, contained in quotation marks). 

ID the unique identifier of the feature derived from the Array List (up to 40 characters long, contained in quotation marks). 

X the X-coordinate in µm of the center of the feature-indicator associated with the feature, where (0,0) is the top left of the image. 

Y the Y-coordinate in µm of the center of the feature-indicator associated with the feature, where (0,0) is the top left of the image. 

Dia. the diameter in µm of the feature-indicator. 

F635 Median median feature pixel intensity at wavelength #1 (635 nm). 

F635 Mean mean feature pixel intensity at wavelength #1 (635 nm). 

F635 SD the standard deviation of the feature pixel intensity at wavelength #1 (635 nm). 

B635 Median the median feature background intensity at wavelength #1 (635 nm). 

B635 Mean the mean feature background intensity at wavelength #1 (635 nm). 

B635 SD the standard deviation of the feature background intensity at wavelength #1 (635 nm). 

% > B635 + 1 SD the percentage of feature pixels with intensities more than one standard deviation above the background pixel intensity, at wavelength #1 (635 nm). 

% > B635 + 2 SD the percentage of feature pixels with intensities more than two standard deviations above the background pixel intensity, at wavelength #1 (635 nm). 

F635 % Sat. the percentage of feature pixels at wavelength #1 that are saturated. 

F532 Median median feature pixel intensity at wavelength #2 (532 nm). 

F532 Mean mean feature pixel intensity at wavelength #2 (532 nm). 

F532 SD the standard deviation of the feature intensity at wavelength #2 (532 nm). 

B532 Median the median feature background intensity at wavelength #2 (532 nm). 

B532 Mean the mean feature background intensity at wavelength #2 (532 nm). 

B532 SD the standard deviation of the feature background intensity at wavelength #2 (532 nm). 

% > B532 + 1 SD the percentage of feature pixels with intensities more than one standard deviation above the background pixel intensity, at wavelength #2 (532 nm). 

% > B532 + 2 SD the percentage of feature pixels with intensities more than two standard deviations above the background pixel intensity, at wavelength #2 (532 nm). 

F532 % Sat. the percentage of feature pixels at wavelength #2 that are saturated. 

Ratio of Medians the ratio of the median intensities of each feature for each wavelength, with the median background subtracted. 

Ratio of Means the ratio of the arithmetic mean intensities of each feature for each wavelength, with the median background subtracted. 

Median of Ratios the median of pixel-by-pixel ratios of pixel intensities, with the median background subtracted. 

Mean of Ratios the geometric mean of the pixel-by-pixel ratios of pixel intensities, with the median background subtracted. 

Ratios SD the geometric standard deviation of the pixel intensity ratios. 

Rgn Ratio the regression ratio of every pixel in a 2-feature-diameter circle around the center of the feature. 

Rgn R² the coefficient of determination for the current regression value. 

F Pixels the total number of feature pixels. 

B Pixels the total number of background pixels. 

Sum of Medians the sum of the median intensities for each wavelength, with the median background subtracted. 

Sum of Means the sum of the arithmetic mean intensities for each wavelength, with the median background subtracted. 

Log Ratio log (base 2) transform of the ratio of the medians. 

Flags the type of flag associated with a feature. 

Normalize the normalization status of the feature (included/not included). 

F1 Median - B1 the median feature pixel intensity at wavelength #1 with the median background subtracted. 

F2 Median - B2 the median feature pixel intensity at wavelength #2 with the median background subtracted. 

F1 Mean - B1  the mean feature pixel intensity at wavelength #1 with the median background subtracted. 

F2 Mean - B2 the mean feature pixel intensity at wavelength #2 with the median background subtracted. 

SNR 1 the signal-to-noise ratio at wavelength #1, defined by (Mean Foreground 1- Mean Background 1) / (Standard deviation of Background 1) 

F1 Total Intensity the sum of feature pixel intensities at wavelength #1 

Index the number of the feature as it occurs on the array. 

"User Defined" user-defined feature data read from the GAL file (GenePix Pro 4.1).
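If the file really follows the layout described above, the header could be read with something like the sketch below. This is only an illustration under assumptions: the function name `read_atf_header` and the tiny sample are made up for this post, each optional record is assumed to sit on one line in `Key=Value` form, and the data fields are assumed to be tab-separated.

```python
import io

def read_atf_header(stream):
    """Parse an ATF-style header; return (version, header dict, column titles)."""
    # Line 1: file type and version number, e.g. "ATF<tab>1.0".
    file_type, version = stream.readline().split()
    if file_type != "ATF":
        raise ValueError("not an ATF file")
    # Line 2: number of optional header records and number of data columns.
    n_records, n_columns = (int(x) for x in stream.readline().split())
    header = {}
    for _ in range(n_records):
        # Assumption: each record is one line of the form "Key=Value".
        record = stream.readline().strip().strip('"')
        key, _sep, value = record.partition("=")
        header[key] = value
    # The line after the optional records holds the column titles.
    titles = stream.readline().rstrip("\r\n").split("\t")
    columns = [t.strip('"') for t in titles]
    if len(columns) != n_columns:
        raise ValueError("expected %d columns, found %d" % (n_columns, len(columns)))
    return version, header, columns

# Made-up miniature sample in the layout described above (not real scan data).
sample = io.StringIO(
    'ATF\t1.0\n'
    '2\t3\n'
    '"Type=GenePix Results 3"\n'
    '"PixelSize=10"\n'
    '"Block"\t"Column"\t"Row"\n'
    '1\t1\t1\n'
)
version, header, columns = read_atf_header(sample)
print(version, header, columns)
```

After the call, the stream is positioned at the first data record, so the remaining lines can be handed to whatever writes the csv output.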

Jan 12 '07 #5

ghostdog74

511

Expert 256MB

hey barton
i've been lurking around :-)...
anyway, thanks for the gpr format. If it's correct, it's now up to the OP to specify his requirements. :)

Jan 12 '07 #6

baber


This example of gpr file is a good one.
gpr format (microarray data file) is like this:

Description
line 1
line 2
Line n

col1 col2 ..... coln
line1 val1 val2 valn
line2 etc etc
line3 etc

Now, I want to know how to convert gpr to csv with Python.

Jan 16 '07 #7

bvdet

2,851

Expert Mod 2GB


If I understand this format correctly, it is a tab delimited file. The script below will replace each tab with a comma and output to another file:


import os

def tab_to_csv(tab_name, csv_name):
    """Replace every tab in tab_name with a comma and write the result to csv_name."""
    try:
        with open(tab_name, 'r') as f1, open(csv_name, 'w') as f2:
            for line in f1:
                f2.write(line.replace('\t', ','))
        return True
    except (IOError, OSError):
        # Report failure if either file cannot be opened, read or written.
        return False

if __name__ == '__main__':
    gpr_file = os.path.join('H:\\', 'TEMP', 'temsys', 'GPR.gpr')
    csv_file = os.path.join('H:\\', 'TEMP', 'temsys', 'GPR.txt')
    if tab_to_csv(gpr_file, csv_file):
        print('Tab delimited file conversion to comma delimited file was successful')
    else:
        print('There was an error')

Jan 16 '07 #8

bvdet

2,851

Expert Mod 2GB

Here's some more information I found on the gpr format:

ATF - Axon Text File format (*.atf)

ATF is a tab-delimited text file format that can be read by typical spreadsheet programs such as Microsoft Excel. It is used for GenePix Array List (GAL) files, and GenePix Results (GPR) files.

An ATF text file consists of records. Each line in the text file is a record. Each record may consist of several fields, separated by a field separator (column delimiter). The tab and comma characters are field separators. Space characters around a tab or comma are ignored and considered part of the field separator. Text strings are enclosed in quotation marks to ensure that any embedded spaces, commas and tabs are not mistaken for field separators.

The group of records at the beginning of the file is called the file header. The file header describes the file structure and includes column titles, units, and comments.
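Since ATF wraps text containing commas in quotation marks, a plain tab-to-comma replace can leave ambiguous commas in the output. A rough sketch using the standard `csv` module is below. It assumes the input is tab-delimited only (the comma-as-separator case and the "spaces around separators are ignored" rule are not handled), and the function name `atf_to_csv` and the sample text are made up for illustration.

```python
import csv
import io

def atf_to_csv(src, dst):
    """Copy tab-delimited records from src to dst as comma-separated records."""
    reader = csv.reader(src, delimiter="\t", quotechar='"')
    writer = csv.writer(dst, lineterminator="\n")
    for fields in reader:
        # Drop empty trailing fields left by padding tabs at the end of a record.
        while fields and fields[-1].strip() == "":
            fields.pop()
        # csv.writer re-quotes any field that contains a comma.
        writer.writerow(fields)

# Made-up sample with one quoted header record and two data lines.
src = io.StringIO(
    '"Block1= 400, 400, 100"\t\t\n'
    'Block\tColumn\tRow\tName\tID\n'
    '1\t1\t1\tVPS8\tYAL002W\n'
)
dst = io.StringIO()
atf_to_csv(src, dst)
print(dst.getvalue())
```

With real files, `src` and `dst` would be file objects opened with `newline=""`, as the `csv` documentation recommends.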

It would be great if baber could provide us with a sample gpr file so we could test it.

Jan 16 '07 #9

dshimer

136

Expert 100+

1) This looks like a very straightforward text file in which you could read in all the lines, create a list of each line, evaluate the lists based on their contents, then just write them back out delimited by commas.

That said, I'll admit I'm still a bit confused by the format. Does this imply that each line "line 1" etc, is comprised of a bunch of data organized in columns? Or that there are N lines containing something, then a string of n entries of "col" data, followed by further strings of value data? In any case I can think of several ways to easily read and analyze the data, I just am not totally clear on what is being described.


Jan 16 '07 #10

bartonc

6,596

Expert 4TB


So this IS GenePix, right?

Jan 17 '07 #11

ghostdog74

511

Expert 256MB


I don't really know what your desired output is, but since you specified csv, I guess you just want comma-separated values. Here's a bit of code:


 
import fileinput

# With inplace=1, whatever is printed replaces the corresponding line of "file".
for line in fileinput.FileInput("file", inplace=1):
    print(','.join(line.split()))

output:


 
line,1

line,2

Line,n
 
col1,col2,.....,coln

line1,val1,val2,valn

line2,etc,etc

line3,etc

Jan 17 '07 #12

bvdet

2,851

Expert Mod 2GB


It works except as indicated below. Before:


 ATF    1            

8    5            

Type=GenePix ArrayList V1.0                

BlockCount=4                

BlockType=0                

URL=http://genome-www.stanford.edu/cgi-bin/dbrun/SacchDB?find+Locus+%22[ID]%22                

"Block1= 400, 400, 100, 24, 175, 5, 175"                

"Block2= 4896, 400, 100, 24, 175, 5, 175"                

"Block3= 400, 4896, 100, 24, 175, 5, 175"                

"Block4= 4896, 4896, 100, 24, 175, 5, 175"                

Block    Column    Row    Name    ID

1    1    1    VPS8    YAL002W

1    2    1    NTG1    YAL015C

After:


 ATF,1

8,5

Type=GenePix ArrayList V1.0

BlockCount=4

BlockType=0

URL=http://genome-www.stanford.edu/cgi-bin/dbrun/SacchDB?find+Locus+%22[ID]%22

"Block1= 400, 400, 100, 24, 175, 5, 175"

"Block2= 4896, 400, 100, 24, 175, 5, 175"

"Block3= 400, 4896, 100, 24, 175, 5, 175"

"Block4= 4896, 4896, 100, 24, 175, 5, 175"

Block,Column,Row,Name,ID

1,1,1,VPS8,YAL002W

1,2,1,NTG1,YAL015C

To prevent duplicate commas at embedded spaces, strip trailing tab and newline characters and split on tabs:


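import fileinput

# gpr_file is the path of the file to convert; the original is kept as a '.bak' copy.
for line in fileinput.input(gpr_file, inplace=True, backup='.bak'):
    print(','.join(line.rstrip('\t\n').split('\t')))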

Good post ghostdog. I did not know about fileinput.
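To see the difference between the two ways of splitting, here is a small comparison on one of the quoted header records from the sample above. It is just an illustration; the variable names are made up.

```python
line = '"Block1= 400, 400, 100, 24, 175, 5, 175"\t\t\t\t\n'

# Splitting on any whitespace breaks the quoted record at every embedded space,
# so joining with commas doubles up the commas that were already there.
by_whitespace = ','.join(line.split())

# Stripping trailing tabs/newline and splitting on tabs keeps the record whole.
by_tab = ','.join(line.rstrip('\t\n').split('\t'))

print(by_whitespace)  # "Block1=,400,,400,,100,,24,,175,,5,,175"
print(by_tab)         # "Block1= 400, 400, 100, 24, 175, 5, 175"
```

Note that the second form still contains the record's own commas; they only stay unambiguous because the record is wrapped in quotation marks.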

Jan 17 '07 #13

baber

Thanks a lot, now I can convert .gpr to .csv.

Baber

Jan 22 '07 #14

bartonc

6,596

Expert 4TB


Awesome! Thanks for the update.

Jan 23 '07 #15

vijayachitra


Hi friends,
I want to know how to get .gpr files (microarray data files) and how to run them in MATLAB.
Please help me as soon as possible. I need it for my project.

Mar 20 '07 #16

bvdet

2,851

Expert Mod 2GB


Hello vijayachitra,

I don't know how to get GPR files. You can probably find some sample files on the internet. You have not given us enough information about what data you need to parse from a GPR file. Since you have found this thread, you can see that information can easily be extracted, but what information and in what format? How about this from our example:


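import re

def readBlockData(fn):
    """Return a dict mapping each BlockN header name to its list of integer values."""
    dd = {}
    with open(fn) as f:
        for line in f:
            # Drop the surrounding quotes, tabs and newline from the record.
            line = line.strip('"\n\t')
            # Match Block1, Block2, ... but not BlockCount or BlockType.
            if re.match(r'Block\d', line):
                tem = line.split('=')
                dd[tem[0]] = [int(i) for i in tem[1].strip().split(', ')]
    return dd

if __name__ == '__main__':
    dd = readBlockData('your_file')
    for key in dd:
        print('%s = %s' % (key, dd[key]))

'''
Sample output (the order of the keys may differ between Python versions):

Block1 = [400, 400, 100, 24, 175, 5, 175]
Block2 = [4896, 400, 100, 24, 175, 5, 175]
Block3 = [400, 4896, 100, 24, 175, 5, 175]
Block4 = [4896, 4896, 100, 24, 175, 5, 175]
'''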

Mar 21 '07 #17
