regex help
llama
All American
841 Posts

I'm not a n00b when it comes to regex, but I haven't been able to figure this one out. This is for VIM syntax highlighting, and the regex that vim uses is basic and somewhat similar to perl.

I have a string like the following:

::blah::111::something::what I want

As you can probably guess, I want to match on whatever's in place of "what I want".
"blah", "111", "something", and "what I want" will always be different, so I can't just match on the string itself

This will match on the whole string and capture what I want in the 4th group, but I don't believe there's any way in vim for me to use this outside of search & replace:

\(::\w\+\)\(::\d\+\)\(::\w\+\)\(::.*\)


I'll also note that
\(::.*?\)$

does not work as vim doesn't support this lazy/non-greedy syntax like perl does

Something like
\(::.*$\)\{-}

doesn't work either
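
For reference, Vim does have a non-greedy quantifier, but it is spelled \{-} and attaches directly to the atom it repeats (as in .\{-}), not to a whole group followed by $. A minimal sketch using the sample line from this post and Vim's built-in matchstr(); the variable name "sample" is only for illustration:

```vim
let sample = '::blah::111::something::what I want'

" Greedy: .* runs as far as it can, up to the last '::' on the line.
echo matchstr(sample, '::.*::')
" Non-greedy: .\{-} takes as little as it can, stopping at the next '::'.
echo matchstr(sample, '::.\{-}::')
```

Non-greedy matching alone does not isolate the last field, because the match still begins at the first "::" on the line; the leading fields have to be anchored and then excluded from the match some other way.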


This is for matching strings in vim to configure syntax highlighting for a filetype that isn't already defined. If someone's done this before and thinks there's a better way to do it inside the syntax file, I'm all ears.

2/11/2011 7:24:26 PM

AstralEngine
All American
3864 Posts

Quote :
"7. Syntax patterns *:syn-pattern* *E401* *E402*

In the syntax commands, a pattern must be surrounded by two identical
characters. This is like it works for the ":s" command. The most common to
use is the double quote. But if the pattern contains a double quote, you can
use another character that is not used in the pattern. Examples:
:syntax region Comment start="/\*" end="\*/"
:syntax region String start=+"+ end=+"+ skip=+\\"+

See |pattern| for the explanation of what a pattern is. Syntax patterns are
always interpreted like the 'magic' option is set, no matter what the actual
value of 'magic' is. And the patterns are interpreted like the 'l' flag is
not included in 'cpoptions'. This was done to make syntax files portable and
independent of 'compatible' and 'magic' settings.

Try to avoid patterns that can match an empty string, such as "[a-z]*".
This slows down the highlighting a lot, because it matches everywhere.


*:syn-pattern-offset*
The pattern can be followed by a character offset. This can be used to
change the highlighted part, and to change the text area included in the
match or region (which only matters when trying to match other items). Both
are relative to the matched pattern. The character offset for a skip
pattern can be used to tell where to continue looking for an end pattern.

The offset takes the form of "{what}={offset}"
The {what} can be one of seven strings:

ms    Match Start       offset for the start of the matched text
me    Match End         offset for the end of the matched text
hs    Highlight Start   offset for where the highlighting starts
he    Highlight End     offset for where the highlighting ends
rs    Region Start      offset for where the body of a region starts
re    Region End        offset for where the body of a region ends
lc    Leading Context   offset past "leading context" of pattern

The {offset} can be:

s         start of the matched pattern
s+{nr}    start of the matched pattern plus {nr} chars to the right
s-{nr}    start of the matched pattern plus {nr} chars to the left
e         end of the matched pattern
e+{nr}    end of the matched pattern plus {nr} chars to the right
e-{nr}    end of the matched pattern plus {nr} chars to the left
{nr}      (for "lc" only): start matching {nr} chars to the left

Examples: "ms=s+1", "hs=e-2", "lc=3".

Although all offsets are accepted after any pattern, they are not always
meaningful. This table shows which offsets are actually used:

                    ms   me   hs   he   rs   re   lc
match item          yes  yes  yes  yes  -    -    yes
region item start   yes  -    yes  -    yes  -    yes
region item skip    -    yes  -    -    -    -    yes
region item end     -    yes  -    yes  -    yes  yes

Offsets can be concatenated, with a ',' in between. Example:
:syn match String /"[^"]*"/hs=s+1,he=e-1

some "string" text
      ^^^^^^  highlighted

Notes:
- There must be no white space between the pattern and the character
offset(s).
- The highlighted area will never be outside of the matched text.
- A negative offset for an end pattern may not always work, because the end
pattern may be detected when the highlighting should already have stopped.
- Before Vim 7.2 the offsets were counted in bytes instead of characters.
This didn't work well for multi-byte characters, so it was changed with the
Vim 7.2 release.
- The start of a match cannot be in a line other than where the pattern
matched. This doesn't work: "a\nb"ms=e. You can make the highlighting
start in another line, this does work: "a\nb"hs=e.

Example (match a comment but don't highlight the /* and */):
:syntax region Comment start="/\*"hs=e+1 end="\*/"he=s-1

/* this is a comment */
  ^^^^^^^^^^^^^^^^^^^  highlighted

A more complicated Example:
:syn region Exa matchgroup=Foo start="foo"hs=s+2,rs=e+2 matchgroup=Bar end="bar"me=e-1,he=e-1,re=s-1

abcfoostringbarabc
   mmmmmmmmmmm       match
     sssrrreee       highlight start/region/end ("Foo", "Exa" and "Bar")


Leading context *:syn-lc* *:syn-leading* *:syn-context*

Note: This is an obsolete feature, only included for backwards compatibility
with previous Vim versions. It's now recommended to use the |/\@<=| construct
in the pattern.

The "lc" offset specifies leading context -- a part of the pattern that must
be present, but is not considered part of the match. An offset of "lc=n" will
cause Vim to step back n columns before attempting the pattern match, allowing
characters which have already been matched in previous patterns to also be
used as leading context for this match. This can be used, for instance, to
specify that an "escaping" character must not precede the match:

:syn match ZNoBackslash "[^\\]z"ms=s+1
:syn match WNoBackslash "[^\\]w"lc=1
:syn match Underline "_\+"

___zzzz ___wwww
^^^     ^^^       matches Underline
    ^ ^           matches ZNoBackslash
           ^^^^   matches WNoBackslash

The "ms" offset is automatically set to the same value as the "lc" offset,
unless you set "ms" explicitly.



Multi-line patterns *:syn-multi-line*

The patterns can include "\n" to match an end-of-line. Mostly this works as
expected, but there are a few exceptions.

When using a start pattern with an offset, the start of the match is not
allowed to start in a following line. The highlighting can start in a
following line though. Using the "\zs" item also requires that the start of
the match doesn't move to another line.

The skip pattern can include the "\n", but the search for an end pattern will
continue in the first character of the next line, also when that character is
matched by the skip pattern. This is because redrawing may start in any line
halfway a region and there is no check if the skip pattern started in a
previous line. For example, if the skip pattern is "a\nb" and an end pattern
is "b", the end pattern does match in the second line of this:
x x a
b x x
Generally this means that the skip pattern should not match any characters
after the "\n".



External matches *:syn-ext-match*

These extra regular expression items are available in region patterns:


*/\z(* */\z(\)* *E50* *E52*
\z(\) Marks the sub-expression as "external", meaning that it can
be accessed from another pattern match. Currently only usable
in defining a syntax region start pattern.


*/\z1* */\z2* */\z3* */\z4* */\z5*

\z1 ... \z9 */\z6* */\z7* */\z8* */\z9* *E66* *E67*
Matches the same string that was matched by the corresponding
sub-expression in a previous start pattern match.

Sometimes the start and end patterns of a region need to share a common
sub-expression. A common example is the "here" document in Perl and many Unix
shells. This effect can be achieved with the "\z" special regular expression
items, which marks a sub-expression as "external", in the sense that it can be
referenced from outside the pattern in which it is defined. The here-document
example, for instance, can be done like this:
:syn region hereDoc start="<<\z(\I\i*\)" end="^\z1$"

As can be seen here, the \z actually does double duty. In the start pattern,
it marks the "\(\I\i*\)" sub-expression as external; in the end pattern, it
changes the \1 back-reference into an external reference referring to the
first external sub-expression in the start pattern. External references can
also be used in skip patterns:
:syn region foo start="start \z(\I\i*\)" skip="not end \z1" end="end \z1"

Note that normal and external sub-expressions are completely orthogonal and
indexed separately; for instance, if the pattern "\z(..\)\(..\)" is applied
to the string "aabb", then \1 will refer to "bb" and \z1 will refer to "aa".
Note also that external sub-expressions cannot be accessed as back-references
within the same pattern like normal sub-expressions. If you want to use one
sub-expression as both a normal and an external sub-expression, you can nest
the two, as in "\(\z(...\)\)".

Note that only matches within a single line can be used. Multi-line matches
cannot be referred to.

"


http://vimdoc.sourceforge.net/htmldoc/syntax.html#:syn-pattern

There's a way to do just about anything in vim. That link will teach you how to make a syntax file, and the selection I posted is specifically for setting up syntax highlights for user defined patterns. This way you can save the file once and whenever you open it in vim you'll get the highlights you want.

good luck

2/14/2011 11:41:01 AM

neolithic
All American
706 Posts

http://xkcd.com/208/

2/14/2011 11:50:46 AM

Shaggy
All American
17820 Posts

here are some lazy ways to do it.

whatuwant = s.substring(s.lastIndexOf("::") + 2);


String[] s2 = s.split("::");
whatuwant = s2[s2.length - 1];

[Edited on February 14, 2011 at 12:22 PM. Reason : 1]

2/14/2011 11:56:16 AM

Stein
All American
19842 Posts

::[^:]+::[^:]+::[^:]+:[^\n]+)

2/15/2011 9:27:59 PM

lewisje
All American
9196 Posts

::[^:]+::[^:]+::[^:]+::([^\n]+)

2/15/2011 9:32:19 PM

llama
All American
841 Posts

Quote :
"http://vimdoc.sourceforge.net/htmldoc/syntax.html#:syn-pattern

There's a way to do just about anything in vim. That link will teach you how to make a syntax file, and the selection I posted is specifically for setting up syntax highlights for user defined patterns. This way you can save the file once and whenever you open it in vim you'll get the highlights you want.

good luck"

Ya, I've got the whole syntax file built except for this small part of the string. syn-pattern-offset may be what I need, thanks. I'll have to play around with it a bit. I've been sick, so I haven't had a chance to look at it.
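
For what it's worth, the offsets in the quoted help text count a fixed number of characters, so they fit best when the part to skip has a known width. A minimal sketch, assuming a hypothetical line format with a fixed 4-character "key=" prefix (not the format from this thread); the group name myFixed is made up:

```vim
" Match "key=" plus the following non-blank text, but start the highlight
" 4 characters into the match, i.e. right after the "=".
syntax match myFixed "key=\S\+"hs=s+4
" Link to an existing highlight group so the result is visible.
highlight default link myFixed String
```

In the "::" lines the leading fields vary in length, so a fixed hs=s+{nr} cannot skip them on its own.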


Shaggy, thanks, it would be extremely easy to do it in perl/python/etc., but can't be done like that in vim.


::[^:]+::[^:]+::[^:]+:[^\n]+)

is basically what I posted above and matches the whole string and not just the section I want, but thanks

2/16/2011 5:43:36 PM

Stein
All American
19842 Posts

What do you mean it matches the whole string but not the section you want?

2/16/2011 6:02:21 PM

BigMan157
no u
103354 Posts

([^:]+)$ maybe

assuming there are no colons in what u want

[Edited on February 16, 2011 at 6:05 PM. Reason : if you rotate your head to the side it kinda looks like a happy pope]

2/16/2011 6:04:41 PM

Shaggy
All American
17820 Posts

oh. i'd rather kill myself than use vim.

2/16/2011 6:52:45 PM

llama
All American
841 Posts

^^ well shit, that works when there's no colons in there. unfortunately, many lines have one or more additional colons in the section I want now that I think about it. fail on my part there
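
One way to allow colons in the last field is to anchor on the leading fields instead of excluding colons from the value. A minimal sketch, assuming every line starts with exactly three "::"-delimited fields (word, digits, word) as in the sample from the first post; the group name myValue and the link to String are hypothetical:

```vim
" Require the three leading fields, then use \zs to start the match
" after the fourth '::'. The trailing .\+ keeps any later colons.
syntax match myValue "^::\w\+::\d\+::\w\+::\zs.\+$"
" Link to an existing highlight group so the match is visible.
highlight default link myValue String
```

Here \zs sets where the match starts, so only the text after the fourth "::" is highlighted. One caveat, as far as I understand Vim's matching: if other syntax items in the file already match the leading fields, this pattern may never be tried from the start of the line, so it can fail to highlight alongside existing rules for those fields.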

2/16/2011 6:58:33 PM

smoothcrim
Universal Magnetic!
18966 Posts

^^

2/16/2011 8:36:18 PM

AstralEngine
All American
3864 Posts

bitches, the both of you. vim is amazing

2/18/2011 10:14:08 AM

scrager
All American
9481 Posts

can you use look ahead in VIM?

2/18/2011 1:33:47 PM

AstralEngine
All American
3864 Posts
user info
edit post

define "look ahead"

2/18/2011 3:55:10 PM

scrager
All American
9481 Posts

http://www.regular-expressions.info/lookaround.html

I can't think of how to apply this at the moment, but I bet it is probably what you need. Though if VIM doesn't support the non-greedy, then it probably won't support lookaround either.
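
Vim does support lookaround, with its own spelling: \@= and \@! for lookahead, \@<= and \@<! for lookbehind (the help text quoted earlier in the thread mentions \@<= as the replacement for "lc"). A minimal sketch for the "::" lines, assuming three leading fields of the form word, digits, word; the group name myValue is hypothetical:

```vim
" Highlight only the text after the fourth '::'. The lookbehind \@<= requires
" the three leading fields to sit right before the match without being part of it.
syntax match myValue "\(^::\w\+::\d\+::\w\+::\)\@<=.\+$"
" Link to an existing highlight group so the match is visible.
highlight default link myValue String
```

Because the lookbehind checks the text before the match rather than consuming it, the leading fields can still be highlighted by other syntax items. Vim's help notes that \@<= can be slow, so treat this as a sketch to try rather than a tuned rule.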

2/18/2011 9:55:11 PM

cdubya
All American
3046 Posts

Quote :
"bitches, the both of you. vim is amazing"


QFT. That said, all of my 'cool' friends give me shit for using vim instead of emacs, so I guess it's a slippery slope.

2/21/2011 2:58:23 AM

scud
All American
10804 Posts

Quote :
"oh. i'd rather kill myself than use vim."


alt-meta-butterfly

2/21/2011 8:47:59 PM

lewisje
All American
9196 Posts

^^^That site is basically the online version of the Help file for the best-regarded regex program in the business: http://www.regexbuddy.com/index.html

by an outfit called Just Great Software; too bad it's not free, but some free alternatives are given here: https://secure.wikimedia.org/wikipedia/en/wiki/RegexBuddy

2/24/2011 10:08:09 PM

simonn
best gottfriend
28968 Posts

i'd rather kill myself than use anything other than vim.

2/25/2011 8:55:37 AM


© 2024 by The Wolf Web - All Rights Reserved.