Top 10 reasons to migrate from Oracle to PostgreSQL

Time:2020-9-29

Author: Paul namug
Paul namuag has been able to hold various positions and has benefited from the opportunity to use various technologies over the past 18 years. He has been a graphics artist and Ms. net developer since 2005, moving to open source technology and a web developer using lamp stack. Later, he was a software engineer / game engineer who worked with several companies to develop mobile or desktop and web applications. In the second half of 2013, he became a MySQL support engineer, and then became a remote DBA of percona, so that he had the opportunity to understand how big data, high scalability, and high availability applications work.

preface

Oracle relational database management system (RDBMS) has been widely used by enterprises and is considered as the most advanced database technology in the market. In general, it is the most commonly used product in RDBMS to compare with other databases and serves as the standard “fact” that the product should provide. Db- engines.com It is rated as the first RDBMS in the market today. PostgreSQL is listed as the fourth RDBMS, but this does not mean that there is no advantage in migrating to PostgreSQL. PostgreSQL has existed since 1989 and opened its source code in 1996. PostgreSQL won the DBMS award of the year for two consecutive years from 2017 and 2018. This shows that it has never stopped attracting a large number of users and enterprises.

One of the reasons for PostgreSQL’s popularity is that people are looking for alternatives to Oracle so that they can cut down on the organization’s high costs and avoid vendor lock-in.

Migrating from an efficient Oracle database can be a daunting task. Concerns such as the company’s TCO (total cost of ownership) are one of the reasons for the company’s delay in deciding whether to abandon oracle.

In this blog, we will introduce the main reasons why some companies choose to leave Oracle and migrate to PostgreSQL.

Reason one: This is a real open source project

It’s similar to postgret, which is an open source license and is distributed under a free license to postgrid, or postgrid. There is no cost to get products and support.

If you want to use database software, it means that you can get all the functions of PostgreSQL database for free. PostgreSQL has a long history in database field for more than 30 years, and has been based on open source since 1996. For decades, developers have been working to create extended functionality. This in itself will enable developers, institutions and organizations to choose PostgreSQL for enterprise applications, supporting leading business and mobile applications.

Once again, enterprises realize that open source database solutions like Postgres provide more capacity, flexibility and support, and are not completely dependent on any company or developer. Like previous Linux, Postgres is designed (and will continue to be) by users committed to solving everyday business problems and choosing to return the solution to the community. Unlike large developers like Oracle, which may have different motivations to develop profitable products or support narrow but profitable markets, the Postgres community is committed to developing the best tools for users to use relational databases on a daily basis.

PostgreSQL usually performs these tasks without adding too much complexity. Its design is strictly focused on working with databases without wasting resources, such as adding functionality to manage other IT environments. This is one of the things when users of the open source software migrate from Oracle to PostgreSQL. Spend a lot of time studying complex techniques about how Oracle databases work or how to optimize and tune them, and you may end up with expensive support. This attracts institutions or organizations to look for alternatives that can reduce costs while bringing profits and productivity. Check out our previous blogs to learn about PostgreSQL’s ability to match SQL syntax with Oracle syntax.

Reason two: there are no licensing restrictions and the community is very large

For users of Oracle RDBMS platform, it is difficult to find any kind of community support that is free or not charged high fees. Organizations, organizations, and developers can often find alternative information online that can provide free answers to questions or solutions.

When using Oracle, it is difficult to decide whether to choose product technical support, because it usually involves more expenses. With PostgreSQL, the community is free, and there are many experienced experts who are willing to solve your current problems.

You can stay here https://lists.postgresql.org/ Subscribe to the mailing list to start contacting the community. PostgreSQL’s novice or genius can communicate here, show and share solutions, technologies, errors, new discoveries, and even share its emerging software. You can even useirc.freenode.netandJoin the PostgreSQL channel, ask for help from IRC chat. You can also join https://postgres-slack.herokuapp.com/ or https://postgresteam.slack.com/ Connect with the community through slack. There are many options, and many open source organizations can help you.

For more details on where to start, visit https://www.postgresql.org/community/ 。

If you’re willing to pay for better professional services in PostgreSQL, there are many options. Even on the website https://www.postgresql.org/support/professional_ You can also find a large number of companies, some of which are cheaper. Even at severalnines, we provide support for Postgres as part of the clustercontrol license or DBA technical consultation.

Reason 3: extensive support for SQL consistency

PostgreSQL has always been keen to adapt to and follow SQL language standards. The formal name of SQL standard is ISO / IEC 9075 “database language SQL”. Any subsequent revision of the standard release will replace the previous version, so there is little value in claiming consistency with the earlier version.

Unlike Oracle, there are still some keywords or operators that do not conform to the ANSI standard SQL (Structured Query Language). For example, beginners can easily confuse outer join (+) (Oracle operator syntax). PostgreSQL follows the ansi-sql standard for join syntax and has the advantage of easily jumping to other open source RDBMS databases, such as MySQL / percona / MariaDB.

Another syntax that is very common in Oracle is to use hierarchical queries. Oracle uses the nonstandard start with.. connect by syntax, while the SQL:1999 Hierarchical query is implemented by recursive common table expression (CTE). For example, compare the different representations of the following hierarchical queries:

Oracle
SELECT
restaurant\_name,
city\_name
FROM
restaurants rs
START WITHrs.city\_name = 'TOKYO'
CONNECT BY PRIOR rs.restaurant\_name = rs.city\_name;

PostgreSQL
WITH RECURSIVE tmp AS (SELECT restaurant\_name, city\_name
FROM restaurants
WHERE city\_name = 'TOKYO'
UNION
SELECT m.restaurant\_name, m.city\_name
FROM restaurants m
JOIN tmp ON tmp.restaurant\_name = m.city\_name)
SELECT restaurant\_name, city\_name FROM tmp;

PostgreSQL has a very similar approach to other top-level open source RDBMS, such as MySQL / MariaDB.

According to the PostgreSQL manual, PostgreSQL is developed to be consistent with the latest official version of the standard. In this case, such consistency does not contradict traditional functions or common sense. Many of the features required to support the SQL standard, although sometimes the syntax or functionality is slightly different. In fact, this is great for PostgreSQL, because PostgreSQL is supported and collaborated by different organizations anyway. The key is that its SQL language meets the requirements of the standard.

PostgreSQL was developed to be consistent with the latest official version of the standard, in which case this consistency does not conflict with traditional functionality or common sense. Many of the features required to support the SQL standard, although sometimes the syntax or functionality is slightly different. With the passage of time, further consistency is expected.

Reason 4: query parallelism

To be fair, compared with Oracle’s parallel execution of SQL statements, PostgreSQL’s query parallelism is not rich. The functions of Oracle parallelism include sentence queuing with prompts, setting the degree of parallelism (DOP), setting the parallelism strategy or the ability of adaptive parallelism.

PostgreSQL has a simple degree of parallelism according to the supported plans, but this does not define Oracle’s advantage in open source PostgreSQL.

The parallelism of PostgreSQL has been continuously improved and enhanced by the community. When PostgreSQL 10 was released, it enhanced its appeal to the public, especially the improvements in parallelism support such as merge join, bitmap heap scan, index scan and index only scan, and aggregate merge. PG_ stat_ Activity adds statistics.

In PostgreSQL V10, parallelism is disabled by default, and you need to set variables
“max_parallel_workers_per_gather
postgres=# \timing
Timing is on.
postgres=# explain analyze select * from imdb.movies wherebirthyear >= 1980 and birthyear <=2005;
QUERYPLAN
—————————————————————————————————————-
Seq Scan on movies  (cost=0.00..215677.28rows=41630 width=68) (actual time=0.013..522.520 rows=84473 loops=1)
Filter: ((birthyear >= 1980) AND (birthyear <=2005))
Rows Removed by Filter: 8241546
Planning time: 0.039 ms
Execution time: 525.195 ms
(5 rows)
Time: 525.582 ms
postgres=# \o /dev/null
postgres=#  select * from imdb.movies where birthyear>= 1980 and birthyear <=2005;
Time: 596.947 ms“
The query plan shows that its actual time can be about 522.5 MS, and then the actual query execution time is about 596.95 Ms.

Let’s look at the results of enabling parallelism
`postgres=# set max_parallel_workers_per_gather=2;
Time: 0.247 ms
postgres=# explain analyze select * from imdb.movies wherebirthyear >= 1980 and birthyear <=2005;
QUERYPLAN
——————————————————————————————————————————-
Gather  (cost=1000.00..147987.62 rows=41630width=68) (actual time=0.172..339.258 rows=84473 loops=1)
Workers Planned: 2
Workers Launched: 2
->  Parallel Seq Scan on movies (cost=0.00..142824.62 rows=17346 width=68) (actual time=0.029..264.980 rows=28158 loops=3)
Filter: ((birthyear >= 1980) AND (birthyear <=2005))
Rows Removed by Filter: 2747182
Planning time: 0.096 ms
Execution time: 342.735 ms
(8 rows)
Time: 343.142 ms
postgres=# \o /dev/null
postgres=#  select * from imdb.movies where birthyear>= 1980 and birthyear <=2005;
Time: 346.020 ms`
The query plan determines that the query requires parallelism, and then uses the gather node. Its actual time is estimated to be 339 MS (including 2 parallel worker processes) and 264 MS (before query plan summary). Now, the actual execution time of the query is 346ms, very close to the actual time estimated in the query plan.

This only illustrates the performance and benefits of PostgreSQL. Although PostgreSQL has its own limitations when using parallelism or query plans to determine that it is faster than using parallelism, its functionality does not differ significantly from Oracle. PostgreSQL’s parallelism is flexible, so long as your query matches the order required for query parallelism, you can enable or utilize it correctly.

Reason 5: Advanced JSON support and continuous improvement

Compared with other open source RDBMS, the JSON support in PostgreSQL is always at the same level. Take a look at the livejournal blog https://obartunov.livejournal… )The JSON support of PostgreSQL shows that it is always more advanced than other RDBMS. PostgreSQL has a large number of JSON functions and functions.

The JSON data type was introduced in postgresql-9.2. Since then, it has made a number of significant improvements, as well as major improvements in postgresql-9.4, adding the data type of jsonb. PostgreSQL provides two data types for storing JSON data: JSON and jsonb. For jsonb, it is an advanced version of the JSON data type, which stores JSON data in binary format. This is a major enhancement, which is quite different from the way you search and process JSON data in PostgreSQL.

Oracle also widely supports JSON. In contrast, PostgreSQL provides a wide range of support, which can be used for data retrieval, data formatting or conditional operations, which can affect the output of data, and even affect the data stored in the database. The data stored with jsonb data type has a great advantage, that is, it can use gin (general inverted index). Gin can be used to effectively search for key or key / value pairs in a large number of jsonb documents.

PostgreSQL has additional extensions to help convert the type of jsonb to the process language it supports. These extensions are jsonb for PL / Perl_ Plperl and jsonb_ plperlu。 For PL / python, they are jsonb_ plpythonu、jsonb_ Plpython2u and jsonb_ plpython3u。 For example, if you use jsonb values to map Perl arrays, you can use jsonb_ Plperl or jsonb_ Plperlu extension.

Arangodb has released a benchmark that compares the JSON performance of PostgreSQL with other JSON enabled databases. Although this is an older blog, it still shows the performance of PostgreSQL’s JSON over other databases. In other databases, JSON is the core function of its kernel. This shows that JSON has certain advantages even if it is only an additional function of PostgreSQL.

Reason 6: support of dbaas by major cloud providers

PostgreSQL has been widely supported as dbaas. These services come from Amazon, Microsoft azure database and Google cloud SQL.

In contrast, Oracle is only available on Amazon RDS for Oracle. The services provided by major participants start at affordable prices and can be flexibly set up according to your needs. This helps institutions and organizations to set up accordingly and reduce the huge cost of bundling on the Oracle platform.

Reason 7: better handling of massive data

PostgreSQL RDBMS is not suitable for working with workload of analysis and data warehouse classes. PostgreSQL is a row oriented database, but it has the ability to store large amounts of data. PostgreSQL has the following limitations when dealing with data stores:
Top 10 reasons to migrate from Oracle to PostgreSQL

The main advantage of PostgreSQL is that there are already plug-ins that can be merged to handle large amounts of data. Cstore of timescaledb and citusdata_ FDW can be merged into time series database to store a large amount of data in mobile and Internet of things applications as one of the plug-ins for data analysis or data warehouse. In fact, clustercontrol (a product) provides support for timescaledb, which is simple and easy to deploy.

If you want to use the core features of PostgreSQL, you can use jsonb to store large amounts of data. For example, a large number of documents (PDF, word, spreadsheet) can be stored using the jsonb data type. For geolocation applications or systems, you can use PostGIS.

Reason 8: cheap scalability, high availability, redundancy / geographic redundancy and fault tolerant solutions

Oracle provides similar solutions, such as Oracle grid, Oracle Real Application Clusters (RAC), Oracle clusterware and Oracle data guard. These technologies can increase your costs, and deployment and maintenance costs are surprisingly expensive, and once used, it’s hard to abandon these solutions. Technical personnel must be trained to enhance their skills and personnel involved in the deployment and implementation process.

PostgreSQL has a lot of support and a lot of choices. PostgreSQL contains streams and logical copies from the built-in core package. You can also set up synchronous replication for PostgreSQL to have more highly available clusters while allowing the standby node to process read queries. For high availability, we recommend that you read our blog PostgreSQL top PG cluster high availability (HA) solution, which covers many good tools and technologies for you to choose from.

There are also some enterprise features that provide high availability, monitoring, and backup solutions. Clustercontrol is one of the technologies, and it’s affordable compared to Oracle solutions.

Reason 9: support multiple process languages: PL / PgSQL, PL / TCL, PL / PerlandPL/Python

Starting with version 9.4, PostgreSQL has a great feature that allows you to define new process languages of your choice. Although not all programming languages are supported, it has many supported languages. Currently, through the basic distribution, it includes PL / PgSQL, PL / TCL, PL / Perl, and PL / python. The external language is:
Top 10 reasons to migrate from Oracle to PostgreSQL

The benefit of this is that, unlike Oracle, new PostgreSQL developers can quickly provide business logic for their application systems without spending more time learning PL / SQL. PostgreSQL makes the developer’s environment easier and more efficient. This nature of PostgreSQL helps developers like PostgreSQL and start moving enterprise platform solutions to open source environments.

Reason 10: flexible indexing of large data sets and text data (Gin, gist, sp-gist and Brin)

PostgreSQL has a huge advantage in supporting indexing, which is conducive to processing big data. Oracle has many index types, and they are also good for dealing with large datasets, especially full-text indexes. But for PostgreSQL, these types of indexes are set flexibly for your purposes. For example, these types of indexes are suitable for big data:

Gin – (generalized inverted index)

This kind of index is suitable for jsonb, hsstore, range and arrays data type columns. This is useful when your data type contains multiple values in a single column. According to the PostgreSQL document, “gin is used to handle the case that the item to be indexed is a compound value, and to search for element values that appear within the compound item through the query processed by the index. For example, these items can be documents, and queries can be searches for documents that contain specific words. “

Gist – (general search tree)

The height of the page consists of a balanced search tree. A node consists of index rows. In general, each row (leaf row) of a leaf node contains predicates (Boolean expressions) and references to table rows (TIDs). Gist indexes are best used for geometric data types, such as looking at whether two polygons contain a point. In one case, a particular point may be contained in a box, while another point exists only in a polygon. When dealing with full-text search, the most common data types to use gist indexes are geometric types and text

When choosing which index type (GIST or gin) to use, consider the following performance differences:

  • Gin index lookup is three times faster than gist
  • Gin indexes take three times longer to build than gist
  • Gin indexes are slightly slower to update than gist indexes, but are about 10 times slower if fast update support is disabled
  • The gin index is two to three times larger than the gist index

As a rule of thumb, gin indexes are best suited for static data because of faster lookup times. For dynamic data, gist indexes are updated faster.

SP gist – (space partition GIST)

For larger datasets with natural but heterogeneous clusters. This type of index makes use of the spatial partition tree. SP gist indexes are most useful when your data has a natural cluster element and is not a balanced tree. A good example is phone numbers, such as in the United States, where they use the following format:

  • 3-digit area code
  • Prefix with 3 digits (related to telephone operator’s switch)
  • 4 digits of line number

This means that you have some natural clustering around the first set of three digits, and the second set of three digits, and then the numbers may spread out more evenly. However, due to the presence of phone numbers, some area codes are much more saturated than others. The result may be that the trees are very unbalanced. Due to natural clustering and uneven distribution of data, data such as telephone numbers can be a good example of spgist.

Brin – (block wide index)

For large data sets arranged in order. A block range is a set of adjacent pages where summary information for all of these pages is stored in an index. Block range indexes can focus on some use cases similar to sp gist. When sorting data naturally, the data is often very large, and it is best to use them. For example, with a billion record tables, especially time series data, brin may provide effective help. If you want to query large amounts of data that are naturally grouped together, such as data from several zip codes (and then aggregated into a city), brin helps ensure that similar postcodes are close to each other on disk.

When you have a very large data set, such as a date or zip code, the brin index allows you to quickly skip or exclude a lot of unnecessary data. In addition, brin is maintained as a smaller index than the overall data size, which gives them an advantage when dealing with large datasets.

conclusion

PostgreSQL has some major advantages when competing with Oracle’s enterprise platforms and business solutions. PostgreSQL is absolutely reliable as the first choice for open source RDBMS because it is almost as powerful as Oracle.

Oracle is hard to beat (which is a hard fact to accept), and it is not easy to abandon the enterprise platform of technology giants. This can be a challenge when the system provides you with capabilities and production results.

But sometimes it is necessary to make a choice in some cases, because the continuous over investment in the platform may exceed the cost of other business layer priorities, thus affecting the schedule.

You can choose PostgreSQL and its underlying platform solutions to help you reduce costs and reduce budget problems; and all the changes are not big.

For more wonderful content, please pay attention to the following platforms and websites:

Official account of China Postgre SQL chapter (technical articles and technical activities):
PostgreSQL branch of Open Source Software Alliance

China postgre SQL branch technical Q & a community:
www.pgfans.cn

Official website of China postgre SQL branch:
www.postgresqlchina.com

Extra!!! Autumn solicitation order — send off spring and summer and usher in golden autumn. In this golden autumn season, the open source software alliance PostgreSQL branch ushered in a new batch of contributions. The main content of the solicitation is closely related to our PostgreSQL. The subject matter of the article is unlimited, and the best is a series of articles. Welcome the majority of technical personnel to continue to contribute!
Submission format: word pdf
Submission email: [email protected]
We will hold PGcon activities recently. We have been invited to many large coffee shops. Please pay attention to our official account number and hear further information.