[Solved] Database engines

Discuss the database features
Post Reply
Liamdale
Posts: 55
Joined: Thu Apr 07, 2011 1:42 am
Location: Quebec, Canada

[Solved] Database engines

Post by Liamdale »

I've been reading posts on the OOoBase SQL database engine, HSQL. I've noticed that many use MYSQL. Are there many differences and advantages to changing to MYSQL.
How will OOoBase react.
Last edited by Liamdale on Sat Apr 16, 2011 5:42 am, edited 1 time in total.
LibreOffice vers 5.0 on Linux Ubuntu (English/French versions)
User avatar
r4zoli
Volunteer
Posts: 2882
Joined: Mon Nov 19, 2007 8:23 pm
Location: Budapest, Hungary

Re: Database engines

Post by r4zoli »

AOO 4.0 and LibO 4 on Win 8
Hungarian forum co-admin
FJCC
Moderator
Posts: 9248
Joined: Sat Nov 08, 2008 8:08 pm
Location: Colorado, USA

Re: Database engines

Post by FJCC »

Base is most useful as a bridge between a database and other OOo documents. I would say that the important thing is to not use the embedded database because of the risk of data loss. If OOo crashes while our are working on an embedded database, you can lose your data. You can separate the HSQLDB engine from the base file, or you can use any other DB engine, such as MySQL. The pro's and con's of various databases are discussed in this thread. It is a long thread, so you might want to search for the phrase "Pro's and Con's". There is an extension for Base that provides an easy connection to MySQL, which is another advantage, though not all that significant.
My advice is to choose the database engine based on your needs first, then worry about connecting Base to the DB. I have connected Base to a corporate Oracle DB and my own MySQL and HSQL databases and it all worked fine.
Be sure to read the thread on data loss in base: http://user.services.openoffice.org/en/ ... 83&t=17567
The linked threads are very useful.
OpenOffice 4.1 on Windows 10 and Linux Mint
If your question is answered, please go to your first post, select the Edit button, and add [Solved] to the beginning of the title.
User avatar
DACM
Volunteer
Posts: 1138
Joined: Tue Nov 03, 2009 7:24 am

Re: Database engines

Post by DACM »

This is probably a better location for the Pro/Con list...(please PM me with any comments, updates or suggestions)

Choose your own Back-End Database for Base:
If you're starting a new project in Base, I suspect that combining it with the latest HSQLDB 2.x or H2 database engine will take you as far as you'll want to go. You'll be hard-pressed to utilize either of these RDBMS's to their full potential, much less find them lacking in features or function. Their modern architectures promote ease-of-use on many levels, whereas legacy RDBMS's can overwhelm Base users with undue complexity and out-dated conventions requiring additional tools. As a bonus, these two Java-based engines are considerably faster than most other RDBMS's; relatively easy to install (a single 1MB file); they're FLOSS; cross-platform; USB flash-drive portable; and HSQLDB is particularly well-known and supported by the Base user-community. They also provide superior LOB support, which allows media-file storage inside the database while maintaining performance through mapped/indirect storage up to 64 Terabytes. And lastly, these engines feature "compatibility modes/settings" to mimic other popular RDBMS's (PostgreSQL, MySQL, Oracle, SQL Server, etc.) in their role as SQL prototyping tools for database professionals. This, and other free tools, can ease migration to/from virtually any SQL-compliant database engine...should it ever become necessary.

But there may be good reasons to skip HSQLDB and H2 altogether. If you're simply connecting Base to an existing database then you really don't have a choice. Otherwise, you'll need to evaluate the available back-end options based on tangibles and intangibles such as: personal experience, licensing, engine maturity, and other distinctives. This is an important step because migration between RDBMS's can be a difficult and tedious process; it's not just a matter of data migration. For instance, Stored Procedures are an extended feature of many RDBMS's (HSQLDB and H2 inclusive) whether procedural SQL or ubiquitous Java routines, but these routines don't always migrate easily between RDBMS's. And then there's the matter of scale, because it is possible to outgrow HSQLDB 2.x and H2 in multi-user environments. These Java engines simply aren't designed for large-scale projects involving more than a few dozen, concurrent, read-write users. Multi-user benchmarks suggest good performance up through 10 high-concurrency connections, which translates into perhaps 100 LAN users when leveraging the built-in multi-user support (MVCC and Connection Pooling). But these engines are relatively unproven as the user-count continues to grow, particularly in stateless connection environments (internet). And despite their support for huge databases (billions of records), relative-performance can suffer in operations such as batch-inserts involving millions of records (which I suspect is a memory-overhead issue related to the excessive size of Java-objects relative to their native counterparts). In any case, large-scale projects -- with hundreds of users, multi-million records, or a significant internet presence -- are better suited for other "back-end" candidates. MySQL (LAMP)/MariaDB (WnMp) (Wnmp) (XAMPP) (WTserver), PostgreSQL (LAPP) or Firebird (Morfik) are up to the task and won't break a sweat with 35+ data-intensive connections (hundreds of actual users) while approaching 1:1 scalability (specifically PostgreSQL) on multi-proc/core hardware. You may also be attracted to the vast MySQL/PostgreSQL/Firebird support communities and add-ons. But then again, these full-scale RDBMS's require separate management (beyond Base) and may not be as well-suited for small-to-medium size projects in a home, office or small-business where the superior speed, footprint, ease-of-use, seamless management, and Java-portability of HSQLDB or H2 remain favorable.

Some RDBMS Pro's and Con's related to Base:*

HSQLDB 1.8.x link; Tutorial: Base+HSQLDB setup
+ Built-into AOO/LibO; close integration and feature support
+ no installation dependencies beyond Java; Java is ubiquitous and installed with AOO/LibO as necessary
+ multi-user 'server-mode' support available but not enabled by default with AOO/LibO (hsqldb.jar)
+ much faster than most engines except perhaps H2; ~5x faster in 'file mode' (running "in-process"); ~2x faster in client-server mode
+ small footprint (1 file; 1MB); all-in-one driver and RDBMS-function; superior cross-platform design and easier to setup than SQLite
+ fully portable (Java/AOO/LibO/HSQLDB all run portably and cross-platform)
+ up to ten 8GB** databases per instance (per port and JRE); 4x that of MS Access (2GB limit) and sufficient for most projects
+ some good third party tools (JDBC based)
+ ACID (A,I, & D are disputed and may not meet ACID requirements)
+ the included SqlTool is a powerful, cross-platform, cross-database, command-line, SQL database administration tool; perfect for automation tasks
- OpenOffice integration is rather pathetic; the default "embedded database" file is susceptible to data loss upon AOO/LibO crash; this particular configuration also reduces default cache settings such that performance suffers as the database grows beyond ~3MB; in any case, Base offers a robust alternative configuration which separates the database files for native access through Base+HSQLDB running in 'file' or 'server' mode.
- some performance advantages are reduced or eliminated even in file/server modes with ample cache as query-results or batch inserts exceed a few million records
- relatively poor scalability (but at least 10 "high concurrency" connections equating to dozens of LAN users)
- relatively limited feature set; limited date math; poor CLOB/BLOB support -- thus there's little reason to use HSQLDB 1.8.x in light of HSQLDB 2.x which is much more powerful and now relatively mature.***

** HSQLDB 2.x supports an unlimited number of 256GB databases and a 64TB store for “large” text/binary (CLOB/BLOB). This vastly superior LOB support enables massive media-file storage without plug-ins or workarounds while maintaining performance.
*** HSQLDB 2.x is a major update; fully-threaded engine; ACID compliant; excellent BLOB/CLOB support and performance; MVCC (enhanced multi-user support); encrypted databases; full-featured SQL-2008 support including date math; SQL stored procedures; improved SqlTool applet, etc.

H2 1.4.x link; Tutorial: Base+H2 setup
+ "automatic mixed-mode" eases client-server setup and flexibility
+ no installation dependencies beyond Java; Java is ubiquitous and installed with AOO/LibO
+ relatively fast; often 5x faster (single-user mode) or 2x faster (multi-user mode) than other RDBMS's
+ small footprint (1 file; 1MB); all-in-one driver and RDBMS-function; superior cross-platform design and easier to setup than SQLite
+ fully portable (Java/AOO/LibO/H2 all run portably and cross-platform)
+ unlimited number of 4TB databases ("SPLIT" command overcomes OS file size limits using ~1 MB memory per 8GB file size)
+ unlimited number of 256GB “large” text/binary (CLOB/BLOB) files. This vastly superior LOB support enables massive media-file storage without plug-ins or workarounds while maintaining full database performance.
+ a built-in, cross-platform, cross-database, graphical, web-interface Console for SQL database-administration; simply double-click the all-in-one file: H2.jar
+ some good third party tools (JDBC based)
+ ACID (D is not technically supported, but H2's clustering support or a UPS or laptop may suffice)
+ relatively full featured: MVCC enhanced multi-user, encrypted databases, etc.
+ relatively new/advanced architecture for future-proofing; same original author as Hypersonic SQL (now HSQLDB)
+ superior linked table support
+ compatible with the PostgreSQL ODBC driver, in addition to JDBC connectivity
- not yet fully-threaded, or at least not when combined with MVCC support (multi-user scaling consideration)
- no internal SQL stored procedures; but supports external Java routines through triggers
- relatively poor scalability (but at least 10 “high concurrency” connections equating to dozens of LAN users)

MySQL 5.6.x Community Server link
+ relatively good scalability; good choice for popular Web sites (MyISAM or Aria engine)
+ unlimited database size support (limited by maximum file size)
+ great third-party tools and support
+ ACID compliance using the InnoDB, XtraDB, PBXT, Aria 2.x, BDB and Cluster engines
+ relatively full featured (beyond the now-depreciated MyISAM engine)
+ relatively new/advanced architecture for future-proofing; exceedingly popular
+ portable version or configuration
- relatively restrictive licensing could be a show-stopper for some projects; free if you release your code under GPL as well; GPL option may not be available for MySQL after 2015 (?); however the MariaDB fork is expected to remain GPL.
- unusually small row size limit: between 8KB and 64KB total depending on version
- known for speed, but benchmarks slower than HSQLDB and H2 as well
- benchmarks reveal scaling issues on multi-proc/core hardware, but InnoDB 1.1 engine addresses this issue
- too many SQL compliance issues; slowly improving
- plug-in engine architecture affects ability to confirm ACID compliance
- probably not the best choice for data warehousing (transactions) relative to PostgreSQL or Firebird, but now much improved with newer InnoDB, XtraDB and Aria engines
- new ownership may adversely affect development pace and/or extent; but the current GPL (licensing) should ensure the longevity and progression of MySQL (or true GPL forks such as MariaDB), in spite of ownership, for decades to come

Firebird 2.5.x link
+ good choice for embedding (1MB); and now available as an experimental feature of LibreOffice Base
+ relatively full-featured and competitive with top-end engines: PostgreSQL, Oracle, DB2 & SQL Server
+ excellent (free; IDPL) licensing terms !
+ great scalability (SMP "classic server" or single-process threaded "SuperServer"); good choice for data warehousing
+ unlimited database size support (limited by maximum file size)
+ good security
+ fully ACID compliant
+ Base & MS Access developers alike may be attracted to Morfik (free; add-on supported) which utilizes Firebird
- more mature than MySQL but not quite as optimized for web-server applications (cross-talk latency)
- practically requires the use of 3rd party tools beyond Base during initial Table/schema development
- relatively smaller support community than MySQL or PostgreSQL
- requires third-party documentation unlike MySQL or PostgreSQL

PostgreSQL 9.x link
+ unrivaled features among free RDBMS's and competitive with Oracle, DB2 & SQL Server
+ excellent (extremely flexible; free; BSD) licensing terms !
+ great scalability; superior SMP scaling; good choice for popular Web sites and data warehousing
+ unlimited database size support (limited by maximum file size)
+ great third-party tools and support; Oracle-compatible EnterpriseDB project
+ relatively full featured; advanced security
+ fully ACID compliant
+ portable version available ( PostgreSQL 8.3.7-1 with pgAdmin III v1.8.4 )
- difficult to compete with the speed of an embedded database (HSQLDB / H2 / SQLite) due to connection overhead
- relatively smaller support community than MySQL

* This simple Pro/Con list is not meant to ignite any flame wars between database experts. These bullets are presented exclusively for consideration relative to Base front-end projects or other OpenOffice data source access. OpenOffice itself is buggy and happy to freeze or crash on occasion. And while this won't affect data in client-server mode, its hardly a professional-grade solution on par with today's Smart-Client solutions.

Further reading:
http://www.h2database.com/html/features.html#comparison
http://www.wikivs.com/wiki/MySQL_vs_PostgreSQL
http://www.encorewiki.org/display/encor ... Comparison

A word about SQLite:
  • I scarcely mentioned SQLite, despite its popularity, wide-spread use, and [former] consideration for inclusion as the default engine for LibreOffice Base. SQLite is particularly popular among developers because it provides good SQL database function in a minimal footprint, without Java dependencies. Packaging such an engine as a code library is a boon for coders because it supplements the [non-persistent array] tools available to the programmer. If the built-in function of SQLite proves inadequate for the programmer, its no trouble to fill-in the necessary function with additional coding. And the difficulties associated with SQLite in multi-platform environments can be streamlined by the programmer with separate installations for each platform (perhaps similar to installing a JRE for JDBC database support).

    However, this does not translate into the best plug-in database engine for an end-user application like Base. The supposed 'ease' of SQLite is lost with Base, especially relative to HSQLDB or H2. SQLite's spotty SQL support and minimal functionality are not preferable in the Base environment. Base Forms and Reports rely heavily on SQL and therefore thrive in a full-featured SQL environment as provided by HSQLDB 2.x and H2. This is a key point because users will turn to macros when they can't find the necessary function within Base or the database engine. This will quickly turn the 'ease' of SQLite into serious 'difficulty' as users find themselves wrestling with the UNO API. Functionality as basic as data-typing is absent in SQLite, so a simple database incorporating a date or timestamp will immediately require macro-code or advanced table structures to parse fields for date/time-based searches. This point (data-typing) led the LibraOffice team to end their pursuit of SQLite as the default for Base. Now imagine the 'difficulty' of scaling a project that was initially developed in SQLite into a multi-user environment. It's a complete rewrite! Not so with HSQLDB and H2 which scale seamlessly to dozens of simultaneous-users, while adhering to ANSI-standard SQL should migration ever become an issue. And lets face it, there's nothing 'easier' than ANSI-standard SQL in today's environment of online SQL-tutorials...and there's nothing more SQL-compliant in the database world than HSQLDB and H2. Not even close with SQLite; even MySQL lacks proper standards compliance. Even the archaic version of HSQLDB (1.8.0.10) as currently delivered with AOO/LibO provides greater function, scalability and SQL-compliance than SQLite. And that's to say nothing of the revolutionary advancements found in the latest HSQLDB 2.x -- which is slated for inclusion in a future version of AOO/LibO (this work has been re-initiated as of Jan 2013).

    Java database engines:
    (1) provide their entire RDBMS function through an all-in-one JAR file, complete with built-in JDBC driver; overall superior to the design of SQLite which requires separate (often third-party) drivers;
    (2) provide standard JDBC connection across all platforms, unlike SQLite's native driver situation (SQLiteJDBC notwithstanding);
    (3) offer many more features and functions in similar memory footprints while performance (speed) is generally on par with SQLite;
    (4) H2 stores the database in a single file just like SQLite; Java databases are bit-compatible across platforms for ease of database portability
    (5) both HSQLDB and H2 are supported in portable application environments including portable Java;
    (6) and Java engines are a natural for Android as well (see H2 on Android) if that becomes an issue in the Base life-cycle.

    The Java dependencies and debugging issues cited by LibreOffice programmers are valid with respect to code developed and maintained by AOO/LibO developers. I understand that these internal bugs are difficult to isolate and fix due to the inter-dependent Java and UNO (C++) code within *Office. But the buggy code is isolated to unnecessary features that should be eliminated from from the *Office code-base to promote data-durability, including the SDBC driver for HSQLDB, and the associated 'new database' wizard that creates the embedded-database file format. Problem solved. But lets not throw-the-baby-out-with-the-bath-water! The necessary JDBC support code appears solid, and it provides a distinctive feature of Base. Leveraging this critical JDBC support, Base templates have eliminated the need for a default database in Base, while upgrading the engine, and maintaining portability. The associated JRE is also necessary for legacy 'HSQLDB Embedded' database support and migration. Besides, HSQLDB to Firebird database migration is an unrealistic goal, while easy Firebird database-management through Base is likely years away, and the current Firebird Embedded design is prone to data-corruption. Since HSQLDB is maintained by expert Java programmers, unrelated to *Office development, Java dependencies should not be an issue for AOO/LibO developers any more than the Python dependencies required by various extensions. If AOO/LibO does offer to install Java, and/or establish a path to the Java folder during install (or automatically upon unsuccessful Java access), then it's no issue at all for the end-user.

    Recent fears related to Java use in browsers has caused Apple to discontinue support for Java. And despite the immediate fixes to Java, Apple's reaction is understandable given Oracle's general apathy towards Java. But Apple's users are quite used to the closed computing environment necessary to claim "it just works." The inability to use the a default Java engine with Base on an Apple device is just one more application that Apple users will happily eschew in order to align their computing needs with Apple's propriety.

    So in my opinion, JDBC support should be maintained and the GUI ultimately refined in support of the traditional split-database configuration. This support is key for legacy database support and migration, and we can only presume that ODBC support would benefit as well from the ensuing GUI optimizations.

.
AOO 4.1.x; LO 4.2.x; Windows 7/8 64-bit
Warning: Avoid embedded databases --> Solution: Adopt a portable 'split database' folder
Soli Deo gloria
Post Reply