[Solved] Importing large number of dbf files

Discuss the database features
Post Reply
fergus
Posts: 32
Joined: Wed Nov 26, 2008 3:34 am

[Solved] Importing large number of dbf files

Post by fergus »

I have approximately 1,000 dbf files that I want to combine and make into one stand alone Database. There will be many duplicate records and I will also want to be finding and deleting these.

I am going to create a Base file with a table that has the exact corresponding fields to what the dbf files contain (all the dbf files have exactly the same fields).

Is there an easy way to import and append the dbf files to the newly created Base file or is it going to be a labour intensive exercise (which is the impression I get after having a look at the help files and elsewhere) ?

Also, can finding and deleting the duplicate records be done easily enough in Base ?
Last edited by fergus on Tue Oct 27, 2009 8:26 am, edited 2 times in total.
OOo 3.3.0 on MS Windows Vista
User avatar
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Importing large number of dbf files

Post by Villeroy »

It's a matter of approximately 90 minutes to copy 1000 files into one file manually using OOo Base and drag&drop as the only tool set.
Using Windows you could utilize an ODBC driver for dBase which comes with Windows ODBC. That driver might support UNION SELECT which returns a set of unique records from merged tables. Compose such a query in a text editor from a simple dump of a file listing (something like DIR *.dbf > dump.txt ).
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
fergus
Posts: 32
Joined: Wed Nov 26, 2008 3:34 am

Re: Importing large number of dbf files

Post by fergus »

I haven't really used Base before (I am a Calc user though) and can't work out how to do the drag and drop you say (from the dbf files to my newly created Base file).

Also, what happens with the primary key ? I tried copying and pasting a dbf file into the newly created Base file but it wouldn't do it because it said I had no data to paste in the Primary Key field (I matched up the dbf fields with their corresponding fields in the Base file and this left the Primary Key field without any data to be pasted in it but it said the Primary Key field has to have data pasted in it)
OOo 3.3.0 on MS Windows Vista
User avatar
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Importing large number of dbf files

Post by Villeroy »

This is what I tested:
1. From some document I get the beamer window (F4), right-click datasource "Bibliography" and open the database document which configures the connection to some dBase directory of my user profile (see statusbar: "dBase /some/directory/path/").
2. It has a single table "biblio". I dragged this a little and when I drop it in the same area, I get prompted to create a copy of it with definition and data to "biblio2". In dBase databases the primary key is not relevant.
3. Now I select table "biblio", Ctrl+C, move down to biblio2, Ctrl+V and I'll be prompted to append to the selected "biblio2". Since both tables have the same structure and types of data, I hit Enter which triggers the finishing [Create] button
Repeat: move up to biblio, Ctrl+C, down, Ctrl+V, Enter. Now I have 3 copies of biblio merged in biblio2.
All you've got to add is hitting the Del-key to delete the copied table and hit keys until all the tables are merged in one.
Of course an SQL script would be more elegant, but this sequence of key sgtrokes takes just a few seconds per table.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
User avatar
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Importing large number of dbf files

Post by Villeroy »

Sorry, I forgot about the duplicates. A query
SELECT DISTINCT * FROM "Your Table"
should deliver what you need.
Currently it is impossible to enforce unique entries by means of dBase indices: http://www.openoffice.org/issues/show_bug.cgi?id=98701
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
fergus
Posts: 32
Joined: Wed Nov 26, 2008 3:34 am

Re: Importing large number of dbf files

Post by fergus »

I followed your instructions (using the Bibliography data source) and am more than happy with that process.

I was also able to create the duplicate query and that works as well.

I want to end up with just one table that has no duplicates in it.

How do I go about this ? Once I have appended all the dbf files into one big file \ table and have created the query to delete the duplicate entries, do I then copy the query to the Table area (thus making it a table ?) and do I then delete the original big file \ table, so that I am just left with a Table with only the records that I want ?
OOo 3.3.0 on MS Windows Vista
User avatar
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Importing large number of dbf files

Post by Villeroy »

I'm not shure how to delete duplicates or copy distinct records in dBase database Base, since dBase support is incomplete, undocumented and partially broken (but nevertheless useful). I tried some SQL commands with INSERT INTO and SELECT INTO without success.
If you want to use dBase with another database application, that other tool should be able to do this easily.
In Base you can simply ignore duplicates in dBase using the above mentioned query instead of the table.

Using Base as the only tool set, I see no way to remove duplicates other than this:
Copy your dBase table with duplicates from the dBase-connected database into a fresh one, newly created from scratch using the built-in HSQLDB engine. Copy definition and data, don't bother about primary key.
Call menu:Tools>SQL and execute

Code: Select all

drop table "biblio3" if exists;
SELECT DISTINCT * INTO "biblio3" FROM "biblio2";
ALTER TABLE "biblio3" ADD COLUMN "ID" INT GENERATED BY DEFAULT AS IDENTITY BEFORE "Identifier";
"biblio2" was my example where I created the duplicates from the original "biblio" and from the copy in the HSQLDB I select only distinct records into a new table biblio3. The result is identical to my original "biblio" in the other database.
Since we disregarded the primary key, there is no way to edit the new table in a HSQLDB. Therefore I add an ID field with an automatically increasing record number as primary key before the field named "Identifier" (the position of a database field is completely unimportant, but I know that you will ask if the new field can have a position other than the last one).

In the table's index designer you can define additional keys so the table won't accept any douplicates in future.
Next version OOo3.2 will re-enable this important option for dBase tables again.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
fergus
Posts: 32
Joined: Wed Nov 26, 2008 3:34 am

Re: Importing large number of dbf files

Post by fergus »

Thanks for your continued help. I may not have explained myself all that well in my previous post.

With your biblio example, you have created a table (biblio2) with the duplicate records in it and a query (Query1) to remove the duplicates.

To remove the duplicates completely, I copied Query1 and then switched to the Tables area and went to Edit - Paste. This prompted me to create a definition and data copy of Query1 which I named biblio3.

I saved all this work and then deleted biblio2 from the Tables area and deleted Query1 from the Queries area, leaving me with just the biblio3 table (with the required no duplicates).

Doing this doesn’t seem to present me with any problems. Is there any reason I shouldn’t do it this way ?
OOo 3.3.0 on MS Windows Vista
User avatar
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Importing large number of dbf files

Post by Villeroy »

Doing this doesn’t seem to present me with any problems. Is there any reason I shouldn’t do it this way ?
I missed this simple and elegant solution. I used to drag around queries from Base into office documents and somehow I knew that you can copy query results within Base, but I did not take this into consideration. Thank you. :bravo:
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
fergus
Posts: 32
Joined: Wed Nov 26, 2008 3:34 am

Re: Importing large number of dbf files

Post by fergus »

Thanks for the compliment, Villeroy. I’m glad to be of some help.

Thanks for the assistance on this problem. I haven’t really used Base before and have a better understanding of how some of it works now.
OOo 3.3.0 on MS Windows Vista
Post Reply