[Solved] Negative regular expression search for TAB

rudolfo
Volunteer
Posts: 1488
Joined: Wed Mar 19, 2008 11:34 am
Location: Germany

Post by rudolfo »

I first tried to do this in a macro, but did not get any
meaningful results. My first suspicion was that this is a
problem with the Basic macro language.

But now I have figured out that it doesn't work in the standard
search/replace dialog either. So here is what I am trying to do:

Search for any text between the beginning of the paragraph
and the first TAB. For TABs the same notation as in Perl or
awk is used: \t
Nice, so I don't have to remember new things.

But it doesn't work in an exclusion list:

^[^\t]+

If I have a line:

Point: TAB more text

It finds "Poin", so it stops at the first "t" it encounters.
Seems like [^\t] is interpreted as:
"Any character except backslash or the letter t"

Typing a literal TAB does not work, because the key moves me out of
the text box.
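
The suspected misreading of [^\t] can be reproduced outside OOo. This is only a Python sketch, not OOo's own engine, and the exact whitespace around the TAB in the sample line is an assumption:

```python
import re

# Sample paragraph from the post; one space before the TAB is assumed.
line = "Point: \tmore text"

# Intended meaning: one or more characters that are not a TAB.
intended = re.match(r"^[^\t]+", line).group(0)

# Suspected OOo reading: the class holds a literal backslash and the letter "t".
suspected = re.match(r"^[^\\t]+", line).group(0)

print(repr(intended))   # 'Point: '
print(repr(suspected))  # 'Poin'
```

The second pattern stops at the "t" of "Point", which is the same result the search dialog gives.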

Any ideas?

Okay, and I should add the important information:

OpenOffice.org 2.3.1 on Windows 2000
Last edited by rudolfo on Fri Mar 28, 2008 12:42 am, edited 1 time in total.
OpenOffice 3.1.1 (2.4.3 until October 2009) and LibreOffice 3.3.2 on Windows 2000, AOO 3.4.1 on Windows 7
There are several macro languages in OOo, but none of them is called Visual Basic or VB(A)! Please call it OOo Basic, Star Basic or simply Basic.
floris v
Volunteer
Posts: 4430
Joined: Wed Nov 28, 2007 1:21 pm
Location: Netherlands

Re: Negative regular expression search for TAB

Post by floris v »

Looks like a bug to me. :( It's broken in 2.4 as well.
All I can suggest is that you replace the tab character by another character, otherwise not used, do the replace with that character, then replace it by the tab character again. When that works, file this as a bug. :lol:
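
The placeholder idea can be sketched in Python to show the order of the steps. The placeholder character and the function name are made up for illustration, and this says nothing about how OOo does it internally:

```python
import re

# Hypothetical placeholder; any character that never occurs in the text would do.
PLACEHOLDER = "\u00a7"  # section sign

def text_before_first_tab(paragraph):
    """Return the text before the first TAB, or None if there is no TAB."""
    if PLACEHOLDER in paragraph:
        raise ValueError("placeholder already occurs in the text")
    # Step 1: swap every TAB for the placeholder.
    swapped = paragraph.replace("\t", PLACEHOLDER)
    # Step 2: search with a negated class that needs no TAB escape.
    cls = re.escape(PLACEHOLDER)
    match = re.match("^([^" + cls + "]+)" + cls, swapped)
    # Step 3 (in a real document): swap the placeholder back to TAB afterwards.
    return match.group(1) if match else None

print(repr(text_before_first_tab("Point: \tmore text")))  # 'Point: '
```

The check at the top matters: if the placeholder already occurs in the document, swapping it back would turn those occurrences into TABs.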

If this solves your problem, please edit the first post of this thread (Edit button) and add [Solved] to the subject line.
OpenOffice 4.1.11 on Ubuntu; LibreOffice 6.4 on Linux Mint, LibreOffice 7.6.2.1 on Ubuntu
If your problem has been solved or your question has been answered, please edit the first post in this thread and add [Solved] to the title bar.
Nederlandstalig forum
acknak
Moderator
Posts: 22756
Joined: Mon Oct 08, 2007 1:25 am
Location: USA:NJ:E3

Re: Negative regular expression search for TAB

Post by acknak »

Try entering the tab character in the "Search for" expression using the number pad: Alt+009

On Linux anyway, that bypasses the dialog's TAB-key event and just enters a literal tab character in the string. I don't know if it will work the same way in Windows, but it's easy to give it a try.

Well, I'm sure you can do it from the clipboard, you just have to select a tab from your document and copy it to the clipboard, then paste it into your pattern.
Edit: PS: See Issue 54340: Can't specify tab in regular expression
AOO4/LO5 • Linux • Fedora 23

Re: Negative regular expression search for TAB

Post by rudolfo »

@acknak:
I tried pasting the TAB from the clipboard.
It doesn't work for me. The TAB seems to be turned into a single space
when it is inserted: the search marked the first word of the next paragraph,
and when I pasted the search text back into my editor, there
was a space in it.
Edit: On Windows, Alt+009 doesn't bypass the dialog events either.
It moves me to the replace text box.
I also had a look at Issue 54340. They recommend
( |\t) as a workaround for [ \t], which is okay for a positive
search, but it can't be used when excluding characters.

Anyway, at least it clearly states that the [...] syntax is
not meant to include multi-character symbols.

Maybe I follow the advice from Floris, although I feel that this
is something that you might do in a simple editor, but rather
not in a fully featured word processor.
On second thought, I might opt for digging into content.xml
and using vim to do the search and replace there.

Re: [Solved] Negative regular expression search for TAB

Post by acknak »

That's too bad, I've used both those tricks many times on Linux. I guess OOo's keyboard handling is completely different on the two systems.

The current plan, as far as I understand it, is to replace OOo's current regular expression engine (never a good one) with a wrapper around a far better one that is already present in the standard libraries that OOo currently uses. So, at some point all these stupid problems with regexp syntax will go away instantly. When that "some point" might be is anyone's guess.
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: [Solved] Negative regular expression search for TAB

Post by Villeroy »

^[^\t]+
Line starting with at least one non-tab?

Try this hack: ^[^\x0009]
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice

Re: [Solved] Negative regular expression search for TAB

Post by acknak »

Definitely worth a try, but I think the root problem is that (as far as OOo is concerned) the only meaning a backslash has in a character class is to quote the following character. So I expect that hack will find lines beginning with one or more characters not including 'x', '0', or '9'. Hopefully I'm wrong.

Re: [Solved] Negative regular expression search for TAB

Post by Villeroy »

I tried actually. The hex-code regex works for me.

Re: [Solved] Negative regular expression search for TAB

Post by acknak »

Cool. I guess I just assumed that it wouldn't work, since that's what happened with the "\t" (I think?). I don't have OOo available to play with ATM; should've kept quiet ;-)

Nice workaround!
huw
Volunteer
Posts: 417
Joined: Wed Nov 21, 2007 1:57 pm

Re: [Solved] Negative regular expression search for TAB

Post by huw »

According to my understanding, Villeroy's suggestion should actually select the first character of any paragraph that doesn't begin with a backslash, an 'x', a zero, or a nine, but that doesn't seem to be the case.

The original post is asking "select everything from the start of a paragraph up to, but not including, the first tab found". Using Villeroy's suggestion:

^[^\x0009]+
If there isn't a tab then this expression will select the whole paragraph.

Re: [Solved] Negative regular expression search for TAB

Post by rudolfo »

huw wrote:The original post is asking "select everything from the start of a paragraph up to, but not including, the first tab found". Using Villeroy's suggestion:

^[^\x0009]+
If there isn't a tab then this expression will select the whole paragraph.
As I originally raised this issue, I should explain
what I was trying to do -- though I am a bit amazed that the discussion
is still going on despite the label "solved". Don't misinterpret this as a
habit of dictatorship, I am actually quite happy to find new suggestions ;)

I need to search for text from the beginning of a paragraph until the
first TAB in that paragraph, but without this final TAB (and only if a
TAB is in the paragraph). So in Perl or egrep it would be

^([^\t]+)\t

To focus on the problem I left out the final TAB.
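
To make the difference between the two patterns concrete, here is a small Python sketch (Python's regex syntax standing in for Perl's here; the helper names are only for illustration):

```python
import re

# Without the trailing TAB: also matches a whole paragraph that has no TAB.
LOOSE = re.compile(r"^[^\t]+")
# With group and trailing TAB: matches only when a TAB is actually present.
STRICT = re.compile(r"^([^\t]+)\t")

def loose_match(paragraph):
    m = LOOSE.match(paragraph)
    return m.group(0) if m else None

def strict_match(paragraph):
    m = STRICT.match(paragraph)
    return m.group(1) if m else None

print(repr(loose_match("no tab here")))          # 'no tab here'
print(repr(strict_match("no tab here")))         # None
print(repr(strict_match("Point: \tmore text")))  # 'Point: '
```

The group keeps the TAB out of the captured text while still requiring that a TAB follows.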

The interesting thing is that Villeroy's suggestion with the hex code

^[^\x0009]+

did work for me, but only once. When I tried it a second time, it did
not find anything. It did not work on a second document either.
I even tried closing all OOo windows and starting it up again.
Still not working.

Another problem is that with OOo 2.3.x I can only work on the full match
of the search; using subgroups like \2 or $2 in the replacement text
does not work. (At least that is how I interpret it, since the changelog of 2.4
says that grouping is now supported in regular expression search.)

Re: [Solved] Negative regular expression search for TAB

Post by huw »

The regular expressions switch under Find & Replace > More Options is not sticky - you have to re-enable it every time you open Find & Replace. I just tried ^[^\x0009]+ again and it still works for me, and I still think it shouldn't...

Backreferences in the form $1, $2, etc. only work in replace from 2.4 onwards. Prior versions don't have backreferences working in replace at all. They do have them in the form \1, \2, etc., but only in find. I don't know what form they need to be in for find in 2.4, as I am not yet using it.

Re: [Solved] Negative regular expression search for TAB

Post by acknak »

The syntax in the "Search for" pattern has not changed for 2.4, it's still \1, \2, ...

Re: [Solved] Negative regular expression search for TAB

Post by rudolfo »

huw wrote: The regular expressions switch under Find & Replace > More Options is not sticky - you have to re-enable it every time you open Find & Replace. I just tried ^[^\x0009]+ again and it still works for me, and I still think it shouldn't...
Yes, hm .. right, if I keep a close eye on the "regular expression" check box
and keep it checked, the ^[^\x0009]+ works consistently all the time on my system
as well.

It's another thing I need to get used to. If I see the old complicated regular
expression as the search item when I open Search & Replace a second time,
I assume that OOo guesses what I want to do -- another regex search.

Re: [Solved] Negative regular expression search for TAB

Post by acknak »

Yes, that quirk is a right pain.

Remember, you can leave the Find and Replace dialog open forever--as long as you like. I've just got in the habit of leaving it up when I'm doing this sort of surgery. There's always something more to mess up.

I have often wondered if it wouldn't be an improvement to designate a certain character string to mark a regular expression, so that there would be no need for the option (although it could be left in place). The traditional regexp delimiters are /.../, I don't see why that wouldn't work here as well: any "Search for" string that begins and ends with "/" would be treated as a regular expression, even if the option wasn't set.

It would make it a challenge to do a non-regexp search for anything that looked like that, e.g. searching for /a.b.c/ would have to be entered as /\/a\.b\.c\//.
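
A rough Python sketch of what such a rule could look like. This is purely hypothetical, nothing like it exists in OOo, and the function name is invented:

```python
import re

def interpret_search(text):
    """Hypothetical rule: /.../ marks a regular expression, anything else is literal."""
    if len(text) >= 2 and text.startswith("/") and text.endswith("/"):
        # Strip the delimiters and compile what is between them as a regexp.
        return re.compile(text[1:-1])
    # Everything else is searched for literally.
    return re.compile(re.escape(text))

# A plain string stays literal: the dot does not act as a wildcard.
plain = interpret_search("a.b")

# The literal text "/a.b.c/" would now have to be wrapped and escaped.
escaped = interpret_search(r"/\/a\.b\.c\//")

# Entered as-is, it would be read as the regexp a.b.c instead.
unescaped = interpret_search("/a.b.c/")
```

With this rule, `unescaped` would also match "axbxc", which is exactly the surprise the escaping is meant to avoid.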

Re: [Solved] Negative regular expression search for TAB

Post by floris v »

It would make it a challenge to do a non-regexp search for anything that looked like that, e.g. searching for /a.b.c/ would have to be entered as /\/a\.b\.c\//.
Yeah... :ugeek: Uber geek alert. :lol: Find and replace is hard enough as it is in OOo. Let's not make it impossible. ;)

Re: [Solved] Negative regular expression search for TAB

Post by huw »

floris v wrote:
acknak wrote:It would make it a challenge to do a non-regexp search for anything that looked like that, e.g. searching for /a.b.c/ would have to be entered as /\/a\.b\.c\//.
Yeah... :ugeek: Uber geek alert. :lol: Find and replace is hard enough as it is in OOo. Let's not make it impossible. ;)
And don't forget how much ire is directed at OpenOffice's current non-standard regular expression implementation. Imagine the wrath at another one! And all directed at one person, acknak, someone who, if their avatar is any guide, couldn't even open the door to get out of the room and flee for their life...
foxcole
Volunteer
Posts: 1507
Joined: Mon Oct 08, 2007 1:31 am
Location: Minneapolis, Minnesota

Re: [Solved] Negative regular expression search for TAB

Post by foxcole »

acknak wrote:Remember, you can leave the Find and Replace dialog open forever--as long as you like. I've just got in the habit of leaving it up when I'm doing this sort of surgery.
Sometimes I wish it were dockable.
Cheers!
---Fox

OOo 3.2.0 Portable, Windows 7 Home Premium 64-bit