I want to find all repeating patterns that start with 1-2 digits, contain some text, strange characters and numbers, and finish with 4 digits.
For example:
ref=""" 1. Lieber, C. M. The incredible shrinking circuit. Sci. Am. 285, 58±64 (2001).
2. Cui, Y. & Lieber, C. M. Functional nanoscale electronic devices assembled using silicon nanowire
building blocks. Science 291, 851±853 (2001).
3. Wang, J. F., Gudiksen, M. S., Duan, X. F., Cui, Y. & Lieber, C. M. Highly polarized photolumi-
nescence and photodetection from single indium phosphide nanowires. Science 293, 1455±1457
(2001)."""
import re
# 1-2 digits, a dot, whitespace, anything, then a year in parentheses and a dot
quer = r"\d{1,2}\.\s+.+\(\d{4}\)\."
ansre = re.findall(quer, ref, re.DOTALL)
print(len(ansre))
It finds only one match.
If I use only the first part of the pattern, I find 23 matches, but of course this captures only the beginning of each entry, i.e. the 1-2 digits.
If I add .+, I find only one match.
I do not know how to say that any character can appear between the 1-2 digits and the 4 digits in the pattern. If I use .+ for this, the regex captures the full text and does not stop at the first 4 digits.
Thank you!
You need to define what you mean by "contains some text" and "strange characters". Those terms are too vague and I have no idea what you mean by them.
But if your goal is to break out those references, you can use this: \d{1,2}\.[^(]+\(\d{4}\)\.
This is assuming that your test data is representative and there are never any parentheses except for the ones around the year at the end of the reference.
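A rough sketch of what that suggestion does, using three entries copied from the reference list posted further down the thread. The variable names are made up for illustration. The negated class [^(]+ works while an entry has only one pair of parentheses, but it skips an entry such as number 17, which has an earlier "(ed.)":

```python
import re

# Three entries copied from the poster's reference list (numbers 15-17).
sample = """15. Baker, R. T. K. Catalytic growth of carbon filaments. Carbon 27, 315±323 (1989).
16. Helveg, S. et al. Atomic scale imaging of carbon nanofibre growth. Nature 427, 426±429 (2004).
17. Massalski, T. B. (ed.) Binary Alloy Phase Diagrams 2nd edn Vol. 1 369±371 (ASM International,
Materials Park, Ohio, 1990).
"""

# [^(]+ matches any run of characters (newlines included) that contains no "(",
# so no re.DOTALL flag is needed here.
pattern = r"\d{1,2}\.[^(]+\(\d{4}\)\."
matches = re.findall(pattern, sample)

for m in matches:
    print(repr(m))
print(len(matches))  # 2: entries 15 and 16 match, entry 17 is skipped
```

So the pattern is only safe if the year is the first parenthesised item in every entry.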
The problem is that there can be parentheses in the text.
I do not know how to define "any character" between the leading 1-2 digits and the closing 4 digits. If I use .+ it captures all the text, but I need to capture as many entries as there are, each starting with 1-2 digits and ending with 4 digits.
I have tried this: \d+\.\s+[\sa-zA-Z0-9±&,:;.)(-]+
but it captures only 3 lines of 21 (numbers 3, 11 and 15 from the text below). I do not know why. The full text to be searched looks like this:
ref="""
1. Haraguchi, K., Katsuyama, T., Hiruma, K. & Ogawa, K. GaAs p-n junction formed in quantum wire
crystals. Appl. Phys. Lett. 60, 745 747 (1992).
2. Björk, M. T. et al. Nanowire resonant tunneling diodes. App. Phys.Lett. 81, 4458±4460 (2002).
3. Thelander, C. et al. Single electron transistors in heterostructure nanowires. Appl. Phys. Lett. 83,
2052±2054 (2003).
4. Wagner,R. S.in Whisker Technology (ed. Levitt, A. P.) 47±119 (Wiley. New York, 1970)
5. Hiruma, K. et al. Growth and optical properties of nanometer scale GaAs and InAs whiskers. J. Appl.
Phys. 77, 447±462 (1995).
6. Duan, X. & Lieber, C. M. General synthesis of compound semiconductor nanowires. Adv. Mater. 12,
298±302 (2000).
7. Duan, X. & Lieber, C. M. Laser assisted catalytic growth of single crystal GaN nanowires. J. Am. Chem.
Soc. 122, 188±189 (2000).
8. Ohlsson, B. J. et al. Size , shape , and position controlled GaAs nano whiskers. Appl. Phys. Lett. 79,
3335±3337 (2001).
9. Kamins, T. I., Stanley Williams, R., Basile, D. P., Hesjedal, T. & Harris, J. S. Ti catalyzed Si nanowires by
chemical vapor deposition: Microscopy and growth mechanisms. J. Appl. Phys. 89, 1008±1016 (2001).
10. Ohlsson, B. J. et al. Growth and characterization of GaAs and InAs nano whiskers and InAs/GaAs
heterostructures. Physica E 13, 1126±1130 (2002).
11. Buffat, P. & Borel, J. P. Size effect on the melting temperature of gold particles. Phys. Rev. A 13,
2287±2298 (1976).
12. Björk, M. T. et al. One dimensional steeplechase for electrons realized. Nano Lett. 2, 87±89 (2002).
13. Gudiksen, M. S., Lauhon, L. J., Wang, J., Smith, D. S. & Lieber, C. M. Growth of nanowires superlattice
structures for nanoscale photonics and electronics. Nature 415, 617±620 (2002).
14. Wu, Y., Fan, R. & Yang, P. Block by block growth of single crystalline Si/SiGe superlattice nanowires.
Nano Lett. 2, 83±86 (2002).
15. Baker, R. T. K. Catalytic growth of carbon filaments. Carbon 27, 315±323 (1989).
16. Helveg, S. et al. Atomic scale imaging of carbon nanofibre growth. Nature 427, 426±429 (2004).
17. Massalski, T. B. (ed.) Binary Alloy Phase Diagrams 2nd edn Vol. 1 369±371 (ASM International,
Materials Park, Ohio, 1990).
18. Gupta, R. P., Khokle, W. S., Wuerfl, J. & Hartnagel, H. L. Diffusion of gallium in thin gold films on
GaAs. Thin Solid Films. 151, L121±L125 (1987).
19. Massalski, T. B. (ed.) Binary Alloy Phase Diagrams 1st edn Vol. 1 191±192 (ASM International, Metals
Park, Ohio, 1986).
20. Magnusson, M. H., Deppert, K., Malm, J. O., Bovin, J. O. & Samuelson, L. Gold nanoparticles:production, reshaping, and thermal charging. J. Nanoparticle Res. 1, 243±251 (1999).
21. Bakkers, E. & Verheijen, M. A. Synthesis of InP nanotubes. J. Am. Chem. Soc. 125, 3440±3441 (2003). """
I found out why the regex captured only lines 3, 11 and 15: it is because of the / characters in the text.
If I use \d+\.\s+[\sa-zA-Z0-9/±&,:;.)(öÖäÄåÅ-]*\d{4}\)\.
all the text is captured, the same as if I were using .+
So the question remains: how do I define "any character" between two other sets of characters? In my case I want to capture 21 lines.
By default, regex uses greedy matching. That's why it doesn't stop after the first occurrence of a match. You need to tell it to be non-greedy.
You can do that by putting a ? after the quantifier (*, + or ?), for example .+? instead of .+
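A small sketch of the difference, using three entries copied from the list above. The variable names and the third, relaxed pattern are assumptions added for illustration, not part of the original answer. Note that entry 4 (like 17 and 19 in the full list) does not end in a bare "(yyyy).", so even the lazy pattern merges it with the following entry; loosening the ending is one possible way around that.

```python
import re

# Three entries copied from the reference list above (numbers 3-5).
# Entry 4 ends in "1970)" and has no "(yyyy)." of its own.
sample = """3. Thelander, C. et al. Single electron transistors in heterostructure nanowires. Appl. Phys. Lett. 83,
2052±2054 (2003).
4. Wagner,R. S.in Whisker Technology (ed. Levitt, A. P.) 47±119 (Wiley. New York, 1970)
5. Hiruma, K. et al. Growth and optical properties of nanometer scale GaAs and InAs whiskers. J. Appl.
Phys. 77, 447±462 (1995).
"""

# Greedy: .+ runs to the end of the text, then backtracks to the LAST "(yyyy).".
greedy = re.findall(r"\d{1,2}\.\s+.+\(\d{4}\)\.", sample, re.DOTALL)

# Lazy: .+? stops at the FIRST "(yyyy)." after each starting number.
lazy = re.findall(r"\d{1,2}\.\s+.+?\(\d{4}\)\.", sample, re.DOTALL)

# Assumed relaxation (not from the thread): end at the first "yyyy)" and
# make the trailing dot optional, so entry 4 is matched on its own.
relaxed = re.findall(r"\d{1,2}\.\s+.+?\d{4}\)\.?", sample, re.DOTALL)

print(len(greedy), len(lazy), len(relaxed))  # 1 2 3
```

The relaxed ending would cut an entry short if a four-digit number followed by ")" ever appeared before the year, so it is worth checking the match count against the expected 21 on the full text.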