Channel: Recent Questions - Stack Overflow

Regex string parsing: pattern starts with ; but can end with [;,)%&@]


I am attempting to parse strings using Regex. The strings look like:

Stack;O&verflow;i%s;the;best!

I want to parse it to:

Stack&verflow%sbest!

So whenever we see a ;, remove it and everything after it, up to (but not including) the next occurrence of one of the following characters: [;,)%&@]. In other words, replace that span with the empty string "".

I am using the re package in Python:

string = re.sub('^[^-].*[)/]$', '', string)

This is what I have right now:

^[^;].*[;,)%&@]

As I understand it, this says: starting at a ;, match everything between that ; and one of the [;,)%&@] characters.

But the result is wrong; the pattern matches this whole span:

Stack;O&verflow;i%s;the;
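The mismatch can be reproduced directly in Python. This is a minimal sketch, assuming the sample string from above; the variable names are only for illustration. The comments describe standard regex behavior: ^ anchors at the start of the string, [^;] is a negated character class, and .* is greedy.

```python
import re

text = 'Stack;O&verflow;i%s;the;best!'
pattern = r'^[^;].*[;,)%&@]'

# ^[^;] anchors at the start of the string and consumes one character
# that is NOT a semicolon (here the leading 'S').
# .* is greedy: it runs to the end of the line and backtracks only as far
# as the LAST character from the set [;,)%&@].
match = re.search(pattern, text)
matched_text = match.group(0)

# Substituting the match with '' therefore removes almost the whole string.
remaining = re.sub(pattern, '', text)

print(matched_text)  # Stack;O&verflow;i%s;the;
print(remaining)     # best!
```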

Demo here.

What am I missing?

EDIT: @InSync pointed out that there is a discrepancy, since ; is also one of the end characters. As worded above, the rule should result in Stack&verflow%s;best! but what I actually want is Stack&verflow%sbest!. Perhaps two regex passes are appropriate here, I am not sure; if you can get to Stack&verflow%s;best! then the rest is just a simple replacement of all the remaining ;.

EDIT2: The code I found that works was

import re

def remove_semicolons(name):
    name = re.sub(';.*?(?=[;,)%&@])', '', name)
    name = re.sub(';', '', name)
    return name

remove_semicolons('Stack;O&verflow;i%s;the;best!')

Or if you feel like causing a headache to the next programmer who looks at your code:

import re

semicolon_string = 'Stack;O&verflow;i%s;the;best!'
cleaned_string = re.sub(';', '', re.sub(';.*?(?=[;,)%&@])', '', semicolon_string))
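For comparison, the two substitutions can probably be folded into a single pass. The following is only a sketch, not a tested replacement: the function name remove_semicolons_single_pass and the constant SINGLE_PASS are made up for illustration. It should behave the same as the two-step version for single-line input. It can differ for multi-line strings, because a negated character class matches newlines while . does not.

```python
import re

# A ';' optionally followed by a run of non-stop characters, but only when
# that run is immediately followed by one of the stop characters [;,)%&@].
# If no stop character follows, the optional group is skipped and only the
# ';' itself is removed.
SINGLE_PASS = re.compile(r';(?:[^;,)%&@]*(?=[;,)%&@]))?')

def remove_semicolons_single_pass(name):
    """Hypothetical single-pass variant of remove_semicolons."""
    return SINGLE_PASS.sub('', name)

print(remove_semicolons_single_pass('Stack;O&verflow;i%s;the;best!'))
# Stack&verflow%sbest!
```

The lookahead keeps the stop character itself out of the match, which is why & and % survive in the output.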
