Extracting information from String (regex, Beanshell)

Hello,

I am working with Taverna to build a workflow. Taverna has a Beanshell where I can program in Java. I am having some problems writing a script. I want to extract information from a string whose entries are separated by newlines. For this I am using regex.

The string is:

P48534
EXP value is: e-10
Q0543
EXP value is: 4e-07


My script looks like this in Beanshell:

  1.  import java.util.regex.Matcher;
  2. import java.util.regex.Pattern;
  3.  
  4. Pattern pGI = Pattern.compile("(^.*?$)");
  5. Pattern pEvaLue = Pattern.compile("is: (.*)$");
  6. Matcher mGI;
  7. Matcher mEvalue;
  8. StringBuffer temp = new StringBuffer();
  9. String [] line = BlastReport.split("\n");
  10. int arraysize = line.length; 
  11.  
  12. for (int i=0; i<(arraysize-1); i+=2){
  13.     String sGI = line[i];
  14.     String sEvalue = line[i+1];    
  15.     mGI = pGI.matcher(sGI);
  16.     mEvalue = pEvalue.matcher(sEvalue);
  17.     String gi="";
  18.  
  19.     if (mGI.find()){
  20.         gi =mGI.group(1);
  21.     }
  22.     if (mEvalue.find()){
  23.         String eval = mEvalue.group(1);
  24.         if(eval.startsWith("e")){
  25.             eval= "1".concat(eval);
  26.         }
  27.         Double d = new Double (eval);
  28.         double Evalue = d.doubleValue();
  29.         if (Evalue<=0.02){
  30.             temp.append(gi + "\n");
  31.         }
  32.     }
  33. }
  34.  
  35. String result = temp.toString().trim();

The error message is: "Attempt to resolve method: matcher() on undefined variable or class name: pEvalue: at Line: 16: pEvalue .matcher(sEvalue)"


Can someone tell me why it is giving me this error and how I can fix it?

Thank you in advance,

Mokita
Aug 6 '07 #1
prometheuzz
197 Expert 100+
It looks like you're trying to capture the text after the is:, right?
Try this:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = "P48534\nEXP value is: e-10\nQ0543\nEXP value is: 4e-07";
// Lookbehind: match the rest of the line that follows "is:" and one whitespace character
Pattern pattern = Pattern.compile("(?<=is:\\s)[^\\n]+");
Matcher matcher = pattern.matcher(text);

// Prints e-10, then 4e-07
while (matcher.find()) {
  System.out.println(matcher.group());
}
Aug 6 '07 #2
r035198x
13,262 8TB
Which one is your line 16?
Aug 6 '07 #3
Mokita
11
My line 16 is:
mEvalue = pEvalue.matcher(sEvalue);

Mokita
Aug 6 '07 #4
r035198x
13,262 8TB
Java is case sensitive.
Now give yourself a kick.
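
To spell that out: the script declares the pattern as pEvaLue (capital L) on its line 5 but uses pEvalue on line 16, so Beanshell sees an undefined name. A minimal sketch of the corrected piece follows, written as plain Java; the class name CaseSensitiveNames and the helper extractEvalue are made up for illustration and are not part of the original script.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class, only here so the snippet compiles on its own.
class CaseSensitiveNames {

    // Returns the text after "is: " on a single line, or null when the line has no such part.
    static String extractEvalue(String line) {
        // Declared as pEvalue (lowercase 'l'); every later use must be spelled identically.
        Pattern pEvalue = Pattern.compile("is: (.*)$");
        Matcher mEvalue = pEvalue.matcher(line);
        return mEvalue.find() ? mEvalue.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractEvalue("EXP value is: 4e-07"));
    }
}
```

In the original script the same fix is a one-character change: rename pEvaLue to pEvalue in the declaration.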
Aug 6 '07 #5
prometheuzz
197 Expert 100+
Splitting your String and stepping by 2 in your for loop looks dangerous. Perhaps I (or someone else) can suggest a better approach. But first you need to explain what it is you're trying to do.

So given the String:
"P48534\nEXP value is: e-10\nQ0543\nEXP value is: 4e-07"
what is it you're trying to extract and/or group?
Aug 6 '07 #6
prometheuzz
197 Expert 100+
Try this:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = "P48534\nEXP value is: e-10\nQ0543\nEXP value is: 4e-07";
// Group 1: the ID (one capital letter plus digits); group 2: the text after "is: "
Pattern pattern = Pattern.compile("([A-Z]\\d+).*\\n?.*((?<=is:\\s)[^\\n]+)");
Matcher matcher = pattern.matcher(text);

System.out.println(text + "\n");

while (matcher.find()) {
  String id = matcher.group(1);
  String sVal = matcher.group(2);
  // A bare exponent such as "e-10" is not a valid double, so prefix the mantissa "1"
  sVal = sVal.startsWith("e") ? "1" + sVal : sVal;
  double dVal = Double.parseDouble(sVal);
  System.out.println(id + "\t" + dVal);
}
Aug 6 '07 #7
Mokita
11
I am trying to build a workflow with Taverna, which has a Beanshell where I can write a script.
In my workflow I will have the output of a BLAST search: GI numbers (such as P23234, Q12344, A12443, or numbers only) and an E-value, which is the "EXP value is: e-10" line.
From that output I want to extract the GI numbers that have an E-value <= 0.02. I thought I could extract them with regular expressions in Java, but the script I wrote is not working.

I think it is clearer now what I want to do, but if you need more explanation, please ask.

Mokita
Aug 6 '07 #8
r035198x
13,262 8TB
I don't think regex is best for this (I could be wrong, of course). You are not searching for a pattern (which is where I mostly use regex); you are searching for numbers within some range.
P.S. I hope you managed to correct the spelling mistake in that variable name.
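
For comparison, here is a rough sketch of what a regex-free version could look like, using only split, indexOf and Double.parseDouble. It assumes the layout from the first post (an ID line followed by a line containing "is:" and the value); the class name EvalueFilter, the method idsAtOrBelow and the MARKER constant are invented for this example. It is plain Java, and older Beanshell versions may need the generics removed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; none of these names come from the original script.
class EvalueFilter {

    // Assumption: every value line contains this marker and no ID line does.
    static final String MARKER = "is:";

    // Returns the IDs whose E-value is less than or equal to the cutoff.
    static List<String> idsAtOrBelow(String report, double cutoff) {
        List<String> ids = new ArrayList<>();
        String currentId = null;
        for (String raw : report.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;               // blank lines do not shift the pairing
            }
            int at = line.indexOf(MARKER);
            if (at < 0) {
                currentId = line;       // no marker: treat the line as an ID
                continue;
            }
            if (currentId == null) {
                continue;               // value line without a preceding ID: skip it
            }
            String value = line.substring(at + MARKER.length()).trim();
            if (value.startsWith("e")) {
                value = "1" + value;    // "e-10" means 1e-10
            }
            // Throws NumberFormatException if the value is not a number.
            if (Double.parseDouble(value) <= cutoff) {
                ids.add(currentId);
            }
            currentId = null;
        }
        return ids;
    }

    public static void main(String[] args) {
        String report = "P48534\nEXP value is: e-10\nQ0543\nEXP value is: 4e-07";
        System.out.println(idsAtOrBelow(report, 0.02));
    }
}
```

Because each value is paired with the most recent ID line instead of with a fixed index, a stray blank line does not throw the pairs out of step the way stepping through the array by 2 can.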
Aug 6 '07 #9
Mokita
11
Hello

I want to thank you for your help; it is working. I also have a quick question.
How can I change group 1, ([A-Z]\\d+), to catch P00DF3, 1234653, Q5647GJD, or A4658DF?
It is not catching the ones with letters in the middle.

Thank you again,

Mokita
Aug 6 '07 #10
prometheuzz
197 Expert 100+
Try replacing "([A-Z]\\d+)" with "(\\w+)".
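
A small sketch of that change, assuming the rest of the pattern from the earlier reply stays the same. The class name WordIdDemo and the helper firstId are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the pattern itself comes from the thread.
class WordIdDemo {

    // Same pattern as before, with group 1 widened from [A-Z]\d+ to \w+.
    static final Pattern ENTRY =
            Pattern.compile("(\\w+).*\\n?.*((?<=is:\\s)[^\\n]+)");

    // Returns group 1 (the ID) of the first entry found, or null if nothing matches.
    static String firstId(String text) {
        Matcher m = ENTRY.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String[] ids = {"P00DF3", "1234653", "Q5647GJD", "A4658DF"};
        for (String id : ids) {
            System.out.println(firstId(id + "\nEXP value is: 4e-07"));
        }
    }
}
```

One thing to watch: \w matches any letter, digit or underscore, so if an ID line is missing, the word "EXP" on the value line will be picked up as the ID. The narrower [A-Z]\d+ did not have that problem.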
Aug 6 '07 #11
