Java interview database


What does SQL optimization involve? How do you create an index? What are the advantages and disadvantages of indexing?

SQL optimization includes:
Avoid select * where possible: returning unused fields reduces efficiency. Optimization: select only the specific fields that are actually needed, so only used fields are returned.
Avoid in and not in where possible, since they can cause the database engine to abandon the index and do a full table scan. Optimization: for a continuous range of values, use between instead; for a subquery, use exists instead.
Avoid fuzzy queries that begin with a wildcard (e.g. like '%abc'), which cause the engine to abandon the index and scan the whole table. Optimization: put the wildcard at the end of the pattern (like 'abc%').
Avoid null-value checks, which cause the engine to abandon the index and do a full table scan. Optimization: give the field a default value such as 0 and compare against 0 instead.
How to create an index:
A database index is a sorted data structure in the database management system. Indexes are usually implemented with B-trees and their variant, the B+ tree.
There are two ways to create an index:
Create an index directly, e.g. with a create index statement or a create-index wizard.
Create an index indirectly, e.g. defining a primary key constraint or a unique key constraint on a table also creates an index.
Function of index:
Indexes help to quickly query and update the data in a database table.
Creating an index on a table also has costs:
First, it increases the storage space used by the database; second, inserting and modifying data takes more time, because the index must be updated along with the data.
Creating an index can greatly improve system performance (advantages):
By creating a unique index, the uniqueness of each row of data in the table can be guaranteed.
It can greatly speed up data retrieval, which is the main reason for creating an index.
It can speed up joins between tables, which is especially valuable for enforcing referential integrity of data.
When grouping and sorting clauses are used during data retrieval, the time spent on grouping and sorting can be significantly reduced.
By using indexes, the query optimizer can be employed during queries, improving system performance.
Adding indexes also has many disadvantages:
Creating and maintaining indexes takes time, and that time grows with the volume of data.
Indexes occupy physical space: besides the data table itself, each index takes up a certain amount of space, and a clustered index needs even more.
When data in the table is inserted, deleted, or updated, the indexes must be maintained dynamically, which slows down data maintenance.
Principles of index creation:
The leftmost prefix matching principle: matching proceeds to the right until a range query is encountered;
= and in can be in any order: the index columns can be declared in any order, and MySQL's query optimizer will rewrite the query into a form the index recognizes;
Prefer columns with high selectivity as indexes. Selectivity is count(distinct col) / count(*), the proportion of non-duplicate values: the higher the ratio, the fewer records we scan; a unique key has selectivity 1;
An index column must not participate in calculations: keep the column "clean";
Try to extend an existing index rather than create a new one: for example, to go from index (a) to (a, b), just modify the original index;
Prefer unique indexes: their values are unique, so a record can be located more quickly;
Index fields that frequently need sorting, grouping, or set operations;
Index fields that are frequently used as query conditions;
Limit the number of indexes;
Prefer indexes over columns with compact data;
Use prefix indexes where possible;
Delete indexes that are no longer used or rarely used.

What is the execution order of SQL keywords?

The from clause assembles data from the different data sources;
The where clause filters records based on the specified conditions;
The group by clause divides the data into groups;
Aggregate functions are computed;
The having clause filters the groups;
All expressions are evaluated;
Order by sorts the result set;
That is: from -> where -> group by -> having -> evaluate all expressions -> order by -> select output.
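The logical order above can be mirrored on an in-memory collection. The sketch below is hypothetical (the class, the record, and the data are made up for illustration) and simulates a query like select dept, count(*) from emp where salary > 1000 group by dept having count(*) >= 2 order by dept:

```java
import java.util.*;
import java.util.stream.*;

public class SqlOrderDemo {
    public record Emp(String dept, int salary) {}

    public static List<String> query(List<Emp> emps) {
        Map<String, Long> groups = emps.stream()
                .filter(e -> e.salary() > 1000)               // WHERE: filter rows first
                .collect(Collectors.groupingBy(Emp::dept,      // GROUP BY dept
                         Collectors.counting()));              // aggregate: COUNT(*)
        return groups.entrySet().stream()
                .filter(g -> g.getValue() >= 2)               // HAVING: filter groups
                .sorted(Map.Entry.comparingByKey())           // ORDER BY dept
                .map(g -> g.getKey() + "=" + g.getValue())    // SELECT: output expressions
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Emp> emps = List.of(new Emp("dev", 2000), new Emp("dev", 3000),
                                 new Emp("hr", 1500), new Emp("hr", 900));
        // hr has only one row above 1000, so HAVING removes it
        System.out.println(query(emps)); // [dev=2]
    }
}
```

Note how the stream pipeline can only reference columns that earlier stages produced, which is exactly why a where clause cannot see aggregate results.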

How to remove duplicate data in SQL?

The general idea is to find one copy of each duplicated row, insert it into a temporary table, delete all the duplicated rows, and then insert the rows from the temporary table back into the original table.
If the duplicated rows are completely identical, use distinct;
If the id column differs and is an auto-incrementing int, use an aggregate function such as max;
If the id column differs and its type is uniqueidentifier, use row_number() over(partition by ...) to add a row number within each group, and insert the rows with row number = 1 into the temporary table.

The difference between inner joins and outer joins

Natural join: a special kind of equi-join. It compares the columns with the same name in the two tables, needs no explicit join condition, and eliminates the duplicate columns from the result.
SQL statement: select ... from table1 natural join table2
Inner join: basically the same as a natural join. The difference is that a natural join requires the compared columns to have the same name, while an inner join does not; you can use using or on to specify the join condition between any two columns.
SQL statement: select ... from table1 inner join table2 on table1.a = table2.e
Left outer join: a join in which all rows of the left table are kept, while only the rows of the right table that satisfy the join condition are kept.
SQL statement: select ... from table1 left outer join table2 on table1.c = table2.c
Right outer join: a join in which all rows of the right table are kept, while only the rows of the left table that satisfy the join condition are kept.
SQL statement: select ... from table1 right outer join table2 on table1.c = table2.c

How to use redis in Java? What data types does redis support and how to use them? How does redis solve the problem of data expiration?

How to use redis:
Install redis (on Windows); start redis, whose default port is 6379; connect with redis-cli.exe -h <host> -p 6379.
Use jedis to connect to redis from Java.
Spring integration with redis:
Introduce the spring-data-redis.jar package and its configuration file;
The spring redis configuration file (which here also contains the database and mybatis configuration);
A redis utility class (the try/catch in it guards against redis being down, so the method can continue to execute);
Include spring-redis.xml in web.xml;
Inject RedisUtil into the business classes that need redis.
Data types supported by redis and their usage scenarios: string, list, hash, set, and sorted set (zset).
How to solve the problem of data expiration in redis
Redis provides two ways to delete expired data:
Periodic deletion: by default, every 100ms redis samples some keys that have an expiration time and deletes the ones that have expired. The advantage is that it avoids long scans over expired keys; the disadvantage is that some expired keys may not be deleted in time.
Lazy deletion: if a key that has expired but was missed by periodic deletion is queried, it is deleted at that moment; a key that has not expired is returned normally.
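The interplay of the two strategies can be sketched in plain Java. This is a hypothetical in-memory model, not redis's actual implementation: the class name, the explicit clock parameter, and the full-scan sweep (redis only samples) are all simplifications for illustration.

```java
import java.util.*;

public class ExpiringStore {
    private final Map<String, String> data = new HashMap<>();
    private final Map<String, Long> expireAt = new HashMap<>(); // key -> deadline in millis

    public void set(String key, String value, long ttlMillis, long now) {
        data.put(key, value);
        expireAt.put(key, now + ttlMillis);
    }

    // Lazy deletion: on access, drop the key if its deadline has passed.
    public String get(String key, long now) {
        Long deadline = expireAt.get(key);
        if (deadline != null && deadline <= now) {
            data.remove(key);
            expireAt.remove(key);
            return null; // behaves as if the key no longer exists
        }
        return data.get(key);
    }

    // Periodic deletion: scan keys with a TTL and evict the expired ones.
    public void sweep(long now) {
        expireAt.entrySet().removeIf(e -> {
            if (e.getValue() <= now) { data.remove(e.getKey()); return true; }
            return false;
        });
    }

    public static void main(String[] args) {
        ExpiringStore store = new ExpiringStore();
        store.set("a", "1", 100, 0);
        System.out.println(store.get("a", 50));  // before the deadline -> 1
        System.out.println(store.get("a", 150)); // after the deadline -> null
    }
}
```

A key the sweep misses is still invisible to readers, because the lazy check in get() is the last line of defense.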

What are the design considerations of database table? Understanding of the three paradigms?

Points for attention in database design:
Field atomicity: make sure every column is atomic and non-decomposable; if one field can express something clearly, never use a second field.
Primary key design: the primary key should not be tied to business logic; ideally it is a meaningless, independent, non-repeating number.
Frequently modified fields: for fields that change often (typically status fields), prefer independent numbers or single letters over Chinese characters or long English strings.
Field length: when creating a table, make the field length about 3-5 characters longer than the actual business requires, ideally a power of 2.
Foreign keys: try not to create foreign keys, to keep each table independent.
Dynamic and static separation: it is best to separate static tables from dynamic tables.
Code values: use numeric codes or letters instead of the actual names, i.e. convert names into codes wherever possible.
Null values: try to avoid null values; if a column contains nulls, index lookups take longer and waste time. You can set a default value when creating the table.
Engine choice: MyISAM's query speed is faster than InnoDB's because it does not always scan the whole table, but MyISAM does not support transactions and so cannot guarantee the ACID properties of data.
Resource storage: the database should not store resource files.
Primary key relevance: following the three normal forms, make sure column data depends on the primary key directly rather than indirectly.
Relationship mapping: for many-to-one or one-to-many relationships, it is best to associate tables by id.
Reserved fields: when designing a table, pre-allocate some blank fields for future expansion.
Availability flag: use a single field to control whether a record is available.
Deleting data: the delete command is forbidden; data is generally not really deleted. Instead, define a status field and change the record's state to give it a valid logical meaning.
Understanding of the three normal forms:
First normal form (1NF): ensure the atomicity of every column;
every column in the table is the smallest indivisible unit.
Second normal form (2NF): every record in the table is unique;
the rows in the table can be distinguished by the primary key.
Third normal form (3NF): the data in the table should not be redundant;
a table should not contain non-key fields that already exist in other tables; associate them through foreign keys instead.

How to understand and use stored procedures?

Stored procedure: one or more SQL commands stored in the database as an executable object. In plain terms, a stored procedure is a set of SQL statements that together accomplish some operation.
Stored procedures can encapsulate and hide complex business logic.
Stored procedures can return values and accept parameters.
A stored procedure cannot be run with a select statement, because it is a subroutine, unlike a view, a data table, or a user-defined function.
Stored procedures can be used for data validation, enforcing business logic, and so on.
Stored procedures are often tied to a specific database, because each vendor supports a different programming language; when switching to another vendor's database system, the existing stored procedures need to be rewritten.
The performance tuning and writing of stored procedures are constrained by the particular database system.
A created stored procedure is saved in the database's data dictionary.
Query the status of all stored procedures: show procedure status;
View the stored procedures of a given database: show procedure status where db = 'database_name';
A MySQL stored procedure is invoked with call plus the procedure name and a pair of parentheses, into which the needed parameters are placed: input parameters, output parameters, or input-output parameters.

How to implement pagination in a database?

SQL Server:
For paging queries, SQL Server feels the most laborious: it has no dedicated paging statement and relies on clever workarounds to achieve paging.
MySQL has the limit clause for querying a slice of the data, which is very convenient.
Oracle has rownum, whose meaning is fairly obvious: the row number, which lets us do segmented queries through a where condition.

How to optimize paging queries over millions of rows?

–Method 1: directly use the SQL statement provided by the database.
–Statement style: in MySQL, select * from table_name limit M, N.
–Suitable scenario: small data volumes (hundreds to thousands of tuples).
–Drawback: a full table scan is slow, and on some databases the result set is unstable (e.g. one run returns 1,2,3 and another returns 2,1,3). The limit clause takes N rows starting from position M of the result set and discards the rest.
–Method 2: create a primary key or unique index and use the index (assuming 10 rows per page).
–Statement style: in MySQL, select * from table_name where id_pk > (pageNum*10) limit M.
–Suitable scenario: large data volumes (tens of thousands of tuples).
–Reason: an index scan is very fast. One caveat: because the data is not read in pk id order, rows can be missed; only method 3 avoids this.
–Method 3: index-based reordering.
–Statement style: in MySQL, select * from table_name where id_pk > (pageNum*10) order by id_pk asc limit M.
–Suitable scenario: large data volumes (tens of thousands of tuples). Ideally the column after order by is the primary key or unique, so the order by can be satisfied by the index and the result set is stable (see method 1 for what stability means).
–Reason: an index scan is very fast. Note that MySQL's sort here only used asc, not desc (desc indexes were fake at the time; real desc support was expected in the future).
–Method 4: based on the index, use prepare (the first ? stands for pageNum, the second ? for the number of tuples per page).
–Statement style: in MySQL, prepare stmt_name from 'select * from table_name where id_pk > (?*?) order by id_pk asc limit M'.
–Suitable scenario: big data volumes.
–Reason: an index scan is very fast, and a prepared statement is a bit faster than an ordinary query.
–Method 5: use MySQL's support for ordered index scans to locate part of the tuples quickly and avoid a full table scan.
For example, to read the tuples in rows 1000 to 1019 (pk is the primary/unique key):
SELECT * FROM your_table WHERE pk >= 1000 ORDER BY pk ASC LIMIT 0, 20
–Method 6: use a subquery/join plus the index to locate the tuples quickly, then read them.
For example, with id as the primary/unique key (the original example marked its variables in blue).
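The core idea behind methods 2, 3 and 5 is "keyset" paging: instead of skipping M rows with offset, seek past the last seen primary key, which an index can serve directly. The sketch below is hypothetical and models the table as an in-memory list of ids; the class and method names are made up for illustration.

```java
import java.util.*;
import java.util.stream.*;

public class KeysetPaging {
    // Equivalent of: WHERE id > lastSeenId ORDER BY id ASC LIMIT pageSize
    public static List<Integer> nextPage(List<Integer> ids, int lastSeenId, int pageSize) {
        return ids.stream()
                .filter(id -> id > lastSeenId) // seek: the index jumps straight here
                .sorted()                      // ORDER BY id ASC (free if the index is ordered)
                .limit(pageSize)               // LIMIT pageSize
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> table = IntStream.rangeClosed(1, 100).boxed()
                                       .collect(Collectors.toList());
        System.out.println(nextPage(table, 0, 10));  // first page: 1..10
        System.out.println(nextPage(table, 10, 10)); // second page: 11..20
    }
}
```

The caller remembers the last id of each page and passes it back, so the cost per page stays constant no matter how deep the page is, unlike limit M, N where M keeps growing.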

How to understand and use optimistic lock and pessimistic lock of database?

A pessimistic lock takes a pessimistic attitude toward data conflicts: it assumes the data will definitely conflict, so it locks the data as soon as it starts reading it (data lock: the data cannot be modified in the meantime).
An optimistic lock assumes the data will generally not conflict, so only when the update is submitted does it actually check for a conflict. If a conflict is found, an error is returned to the user, who decides what to do next.
Pessimistic locks usually rely on the locking mechanism of the database itself, e.g. MySQL's exclusive lock: select ... for update implements a pessimistic lock.
Optimistic locks do not rely on the database's locking mechanism and must be implemented by ourselves. The usual implementation records a data version, either through a version number or through a timestamp.
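The version-number scheme can be sketched as follows. This is a hypothetical in-memory model of the SQL pattern update t set val = ?, version = version + 1 where id = ? and version = ?; the class and field names are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class OptimisticLockDemo {
    private volatile int value;
    private final AtomicInteger version = new AtomicInteger(0);

    public int readVersion() { return version.get(); } // read the row and its version

    // Returns true iff the update was applied, i.e. the version still matched.
    public synchronized boolean update(int newValue, int expectedVersion) {
        if (version.get() != expectedVersion) {
            return false; // conflict detected at commit time: caller retries or reports failure
        }
        value = newValue;
        version.incrementAndGet(); // version = version + 1, as in the SQL pattern
        return true;
    }

    public static void main(String[] args) {
        OptimisticLockDemo row = new OptimisticLockDemo();
        int v = row.readVersion();             // two "transactions" both read version 0
        System.out.println(row.update(10, v)); // first writer wins -> true
        System.out.println(row.update(20, v)); // second writer's version is stale -> false
    }
}
```

Nothing is locked while the users think; the conflict only surfaces when the losing writer's where-version clause matches zero rows.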

The conversion between string and date in database?

Date to string: to_char(date, format)
select to_char(sysdate, 'yyyy"year"mm"month"dd"day"') as time_string from dual;
String to date: to_date(str, format)
select to_date('2019-10-25 17:15:20', 'yyyy-MM-dd HH24:mi:ss') from dual;
MySQL built-in functions: in MySQL, str_to_date() converts a string to a date.
Example (the separators and the date parts must be consistent):
String to date:
select str_to_date('2019-10-25 15:43:28', '%Y-%m-%d %H:%i:%s');
Date to string:
select date_format(sysdate(), '%Y-%m-%d %H:%i:%s') from dual;
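The same conversions on the Java side use java.time, where DateTimeFormatter plays the role of Oracle's to_date/to_char and MySQL's str_to_date/date_format. The class name below is made up for illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class DateConvertDemo {
    // Equivalent of the SQL format masks above (note Java's mm = minute, MM = month)
    static final DateTimeFormatter F = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // string -> date, like to_date / str_to_date
    public static LocalDateTime parse(String s) { return LocalDateTime.parse(s, F); }

    // date -> string, like to_char / date_format
    public static String format(LocalDateTime d) { return d.format(F); }

    public static void main(String[] args) {
        LocalDateTime d = parse("2019-10-25 17:15:20");
        System.out.println(d.getYear()); // 2019
        System.out.println(format(d));   // 2019-10-25 17:15:20
    }
}
```

As with the SQL functions, the format mask and the string must agree exactly, or the parse throws an exception.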

What’s the difference between union and union all?

The difference between union and union all is that union automatically removes duplicate rows across the combined result sets, while union all shows all rows, duplicates included.
Union: the union of the two result sets, excluding duplicate rows, sorted by the default rules. After combining the tables, duplicate records are filtered out: the combined result set is sorted, duplicate records are deleted, and then the result is returned.
Union all: the union of the two result sets, including duplicate rows, without sorting. If the two result sets contain duplicate data, the returned result set will contain those duplicates.
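A minimal in-memory illustration of the difference: union behaves like a duplicate-removing, sorted set union, while union all is plain concatenation. The class and method names are made up for this sketch.

```java
import java.util.*;

public class UnionDemo {
    // Like UNION: deduplicate and return in default (sorted) order
    public static List<Integer> union(List<Integer> a, List<Integer> b) {
        Set<Integer> merged = new TreeSet<>(); // sorted set: dedup + ordering
        merged.addAll(a);
        merged.addAll(b);
        return new ArrayList<>(merged);
    }

    // Like UNION ALL: keep every row, no dedup, no sort
    public static List<Integer> unionAll(List<Integer> a, List<Integer> b) {
        List<Integer> merged = new ArrayList<>(a);
        merged.addAll(b);
        return merged;
    }

    public static void main(String[] args) {
        List<Integer> a = List.of(1, 2, 3), b = List.of(2, 3, 4);
        System.out.println(union(a, b));    // [1, 2, 3, 4]
        System.out.println(unionAll(a, b)); // [1, 2, 3, 2, 3, 4]
    }
}
```

The extra sort-and-dedup step is also why union is slower than union all when you already know the inputs are disjoint.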

What are the storage engines of MySQL?


InnoDB: the default storage engine as of MySQL 5.6. InnoDB is a transaction-safe storage engine with commit, rollback, and crash recovery to protect user data. InnoDB's row-level locking and Oracle-style consistent non-locking reads improve multi-user concurrency and performance. InnoDB stores user data in a clustered index to reduce the I/O overhead of common primary-key-based queries. To ensure data integrity, InnoDB also supports foreign key constraints.

MyISAM: does not support transactions or foreign keys; its advantage is fast access. However, table-level locking limits its performance under mixed read-write load, so it is often used in read-only or read-mostly scenarios.

Memory: all data is stored in memory, used for fast lookup of non-critical data. Memory tables access data very quickly because the data lives in memory, and they use hash indexes by default, but once the service shuts down, the data in the table is lost.

Blackhole: a black-hole storage engine, similar to Unix's /dev/null. Blackhole accepts data but does not store it, and queries on its tables always return an empty set. Such tables can be used in master-slave setups where DML statements need to be relayed to the slave servers but the master should not keep a copy of the data.

CSV: its tables are really comma-separated text files. CSV tables let you import and export data in CSV format and exchange data with scripts and applications that read and write the same format. Since CSV tables have no indexes, you are better off keeping the data in InnoDB tables during normal operation and using CSV tables only during the import or export stage.

NDB (also known as ndbcluster): this clustered database engine is especially suited to applications that require the highest possible uptime and availability. Note: the NDB storage engine is not supported in standard MySQL 5.6; it is provided through the MySQL Cluster releases.

The versions of MySQL Cluster include: MySQL Cluster NDB 7.1, based on MySQL 5.1; MySQL Cluster NDB 7.2, based on MySQL 5.5; and MySQL Cluster NDB 7.3, based on MySQL 5.6. MySQL Cluster NDB 7.4, also based on MySQL 5.6, is currently in development.

Merge: allows a MySQL DBA or developer to group a series of identically structured MyISAM tables and reference them as a single object. It is suitable for very large data scenarios such as data warehousing.

Federated: provides the ability to link separate MySQL servers on multiple physical machines into one logical database. It is suitable for distributed or data-mart scenarios.

Example: this storage engine exists as an example of how to start writing a new storage engine in the MySQL source code. It is mainly for interested developers. The engine is a "stub" that does nothing: you can create tables with it, but you cannot store data in them or retrieve indexes from them.

What are the transaction isolation levels? What is the default isolation level between MySQL and oracle?

There are four transaction isolation levels: read uncommitted, read committed, repeatable read, and serializable. MySQL (InnoDB) defaults to repeatable read, while Oracle defaults to read committed.
Four characteristics of transactions (ACID)
Atomicity: a transaction is the smallest execution unit and cannot be split; atomicity ensures its actions either all complete or have no effect at all.
Consistency: the data is consistent before and after the transaction executes; multiple transactions reading the same data see the same result.
Isolation: under concurrent access to the database, one user's transaction is not interfered with by other transactions; concurrent transactions are independent of each other.
Durability: after a transaction is committed, its changes to the data are permanent; even a database failure should not affect them.
Dirty read? Phantom read? Non-repeatable read?
Dirty read: a transaction has updated a piece of data, and another transaction then reads that same data; if the first transaction rolls back, the data read by the second transaction is wrong.
Phantom read: within one transaction, two identical queries return a different number of rows. For example, one transaction queries several rows of data while another transaction inserts new rows at the same time; in the later query, the first transaction finds rows that were not there before.
Non-repeatable read: within one transaction, two identical queries return inconsistent data, which may be because another transaction updated the original data between the two queries.

How to convert row to column and column to row in SQL?

Row to column:
use case when together with sum or max.
Column to row:
use union, case when, and concat.
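The row-to-column idea behind max(case when ...) can be sketched in plain Java. This hypothetical example (the record and names are made up) spreads each (name, subject, score) row into one cell of a per-name column map:

```java
import java.util.*;

public class PivotDemo {
    public record Row(String name, String subject, int score) {}

    // Row to column: group by name, spread subjects into columns.
    // merge(..., Math::max) mirrors MAX(CASE WHEN subject = ... THEN score END).
    public static Map<String, Map<String, Integer>> pivot(List<Row> rows) {
        Map<String, Map<String, Integer>> out = new TreeMap<>();
        for (Row r : rows) {
            out.computeIfAbsent(r.name(), k -> new TreeMap<>())
               .merge(r.subject(), r.score(), Math::max);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(new Row("tom", "math", 90),
                                 new Row("tom", "english", 80));
        System.out.println(pivot(rows)); // {tom={english=80, math=90}}
    }
}
```

Column to row is simply the inverse walk: iterate each name's column map and emit one (name, subject, score) row per entry, which is what the union of per-column selects does in SQL.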

How to view SQL execution plan?

In SQL Server: set statistics profile on. In MySQL, prefix the statement with explain to view its execution plan.

What are the analysis functions in oracle?

Analytic functions are a powerful Oracle feature for the statistical needs of complex reports. They group the data, compute statistical values per group, and can return one statistical value for each row of each group.
An analytic function has a window clause, over(), which can contain three parts: partition by, order by, and rows;
first_value() and last_value(): find the other attributes that correspond to an extreme value;
rank(), dense_rank() and row_number(): ranking and sorting;
lag() and lead(): reference the n-th row before or after the current row;
rollup() and cube(): grouping over permutations and combinations;
max(), min(), sum() and avg(): find extreme values, sums, and averages.

In addition to aggregate functions, what other functions are commonly used in database? How to use the merge() function of Oracle database?

Function and use of Oracle database merge():
Usually, when inserting data, we first check whether the row already exists: if it does, we update it; if not, we insert it. That normally takes two SQL statements, one insert and one update. Oracle's merge statement can solve this in one step.

What is the difference between drop, truncate and delete in SQL?

delete removes rows one by one and can take a where condition; it is DML, is written to the log, and can be rolled back. truncate removes all rows at once; it is DDL, is faster, resets the auto-increment counter, and cannot be rolled back. drop removes the entire table, including its structure, indexes, and constraints.

How to ignore the case of table names in MySQL?

After MySQL is installed on Windows, table names are case-insensitive by default.
On Linux, some systems need to be configured manually: log in as root, open and modify /etc/my.cnf, add a line lower_case_table_names=1 under the [mysqld] node, then restart the MySQL service: systemctl restart mysqld

What’s the difference between having and where?

The where clause filters the rows of the table during the query. It cannot use aggregate functions as filter conditions, because the timing is wrong: aggregate functions summarize the result set after the rows have been retrieved, and the two run at different stages.
The having clause filters the query's result set after grouping, and aggregate functions can follow it, because the timing matches: the aggregates are computed over the result set, and having filters on them; the two are used together. In addition, having cannot be used alone; it can only follow a group by clause.

Function and use of cursor?

A cursor is a mechanism for extracting one record at a time from a result set containing multiple records. A SQL cursor is a kind of temporary database object: it can hold a copy of data rows from a table, or point at rows stored in the database. Cursors provide a way to manipulate the data in a table row by row.
A common use of cursors is to save a query result for later reuse.
A cursor's result set is generated by a select statement. If a record set needs to be used repeatedly, creating a cursor once and reusing it several times is much faster than querying the database repeatedly.
%FOUND: whether the cursor still has data; returns true if so, otherwise false.
%NOTFOUND: the opposite of %FOUND.
%ISOPEN: whether the cursor is open.
%ROWCOUNT: the number of records fetched from the cursor so far.

How to use timer in database? Trigger? Scheduled tasks?

How to implement recursive query in oracle?

start with <condition 1> connect by prior <condition 2> where <condition 3>
First form: start with child_id = '...' connect by prior child_id = parent_id. Recursively queries the row matched by the condition and its descendants; the result is all descendant nodes (including itself).
Second form: start with child_id = '...' connect by child_id = prior parent_id. Recursively queries the row matched by the condition and its ancestors; the result is all ancestor nodes (including itself).
Third form: start with parent_id = '...' connect by prior child_id = parent_id. Recursively queries the descendants of the row matched by the condition (excluding itself); the result is all descendant nodes (excluding itself).
Fourth form: start with parent_id = '...' connect by child_id = prior parent_id. Recursively queries the first-generation children of the matched row and that row's ancestors; the result is its first-generation descendants plus all ancestor nodes (including itself).
If there is a where condition, the execution order is: start with ... connect by prior first, and then filter by the where condition.

How to ensure the safety of modified data under high concurrency?

Use synchronized: add a synchronized block around the ID-generating code; the problem is solved.
Its effect: at any moment, only one thread can execute the code block.
Use a Lock: add a lock around the ID-generating code; the problem is solved.
Pessimistic lock: while modifying data, hold a lock and reject outside requests; whoever hits the locked state must wait.
Disadvantage: under high concurrency, every request waits for a "lock"; some threads may never get a chance to grab it, and their requests die there. Many such requests instantly raise the system's average response time, the available connections get exhausted, and the system falls into an abnormal state.
FIFO queue: adopt a FIFO (first in, first out) queue so that no request is starved of the lock forever.
Disadvantage: with many requests, the queue's memory may "burst" instantly, leaving the system abnormal; or you design a huge in-memory queue, but the rate at which the system drains the queue cannot match the rate at which requests pour in, so the queue keeps growing and the web system's average response time still drops significantly, and the system remains abnormal.
Optimistic lock: compared with the pessimistic lock, an optimistic lock uses a looser locking mechanism, usually update-with-version-number. Every request is allowed to attempt the modification, but each gets a version number of the data; only the request whose version number matches succeeds, and the others are told the purchase failed. This avoids the queueing problem, at the cost of extra CPU overhead.
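The synchronized approach above can be sketched as a thread-safe ID generator. This is a hypothetical example (the class name and the helper that counts distinct ids are made up for the demonstration):

```java
import java.util.Set;
import java.util.concurrent.*;

public class IdGenerator {
    private int next = 0;

    // Only one thread at a time can run this, so no two callers get the same id.
    public synchronized int nextId() {
        return ++next;
    }

    // Helper for the demo: generate n ids from 4 threads, return how many were distinct.
    public static int generateDistinct(int n) throws Exception {
        IdGenerator gen = new IdGenerator();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < n; i++) {
            pool.submit(() -> seen.add(gen.nextId()));
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return seen.size();
    }

    public static void main(String[] args) throws Exception {
        // Without synchronized, ++next is not atomic and duplicates would appear.
        System.out.println(generateDistinct(1000));
    }
}
```

A java.util.concurrent.locks.ReentrantLock in a try/finally block would achieve the same serialization with more control (tryLock, fairness), which is the "Lock" variant mentioned above.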

How to realize primary key auto increment in oracle?

Oracle has no auto_increment attribute, so it cannot define a self-incrementing primary key in a table the way MySQL does. However, an Oracle sequence can implement a self-incrementing primary key indirectly, and a trigger can also be created to achieve auto-increment.
A sequence, also called a sequence generator, provides a series of numbers; developers use sequences to generate unique keys. On each access, the sequence increases or decreases according to a defined rule.
The definition of a sequence is stored in the system tablespace; unlike a table, a sequence occupies no disk space.
Sequences are independent of transactions: commits and rollbacks do not affect the sequence.

How to restore deleted data without backup?

Flashback query
–Use the flashback query feature available since Oracle 10g: a flashback query can read the data as it was some time ago.
LogMiner log mining
–Use LogMiner to dig the delete out of the redo log and see whether the corresponding undo rollback SQL is available.
Use the Oracle PRM-DUL tool
–The Oracle PRM-DUL tool can recover records that were deleted from tables in the database.

How to deal with Oracle deadlock?

The usual approach is to locate the blocking session by querying v$locked_object and v$session, terminate it with alter system kill session 'sid,serial#', and then analyze and fix the SQL or the transaction ordering that caused the deadlock so it does not recur.

Redis data persistence?

Redis supports two persistence mechanisms: RDB, which writes point-in-time snapshots of the dataset to disk, and AOF (append-only file), which logs every write command and replays the log on restart. The two can be combined; AOF generally loses less data, while RDB files are more compact and restore faster.