Welcome to Knowledge Base!

KB at your finger tips

This is one stop global knowledge base where you can learn about all the products, solutions and support features.

Categories
All
Database-PostgreSQL
70.1. Introduction

70.1. Introduction

GIN stands for Generalized Inverted Index. GIN is designed for handling cases where the items to be indexed are composite values, and the queries to be handled by the index need to search for element values that appear within the composite items. For example, the items could be documents, and the queries could be searches for documents containing specific words.

We use the word item to refer to a composite value that is to be indexed, and the word key to refer to an element value. GIN always stores and searches for keys, not item values per se.

A GIN index stores a set of (key, posting list) pairs, where a posting list is a set of row IDs in which the key occurs. The same row ID can appear in multiple posting lists, since an item can contain more than one key. Each key value is stored only once, so a GIN index is very compact for cases where the same key appears many times.

GIN is generalized in the sense that the GIN access method code does not need to know the specific operations that it accelerates. Instead, it uses custom strategies defined for particular data types. The strategy defines how keys are extracted from indexed items and query conditions, and how to determine whether a row that contains some of the key values in a query actually satisfies the query.

One advantage of GIN is that it allows the development of custom data types with the appropriate access methods, by an expert in the domain of the data type, rather than a database expert. This is much the same advantage as using GiST .

The GIN implementation in PostgreSQL is primarily maintained by Teodor Sigaev and Oleg Bartunov. There is more information about GIN on their website.

70.1. Introduction

70.1. Introduction

GIN stands for Generalized Inverted Index. GIN is designed for handling cases where the items to be indexed are composite values, and the queries to be handled by the index need to search for element values that appear within the composite items. For example, the items could be documents, and the queries could be searches for documents containing specific words.

We use the word item to refer to a composite value that is to be indexed, and the word key to refer to an element value. GIN always stores and searches for keys, not item values per se.

A GIN index stores a set of (key, posting list) pairs, where a posting list is a set of row IDs in which the key occurs. The same row ID can appear in multiple posting lists, since an item can contain more than one key. Each key value is stored only once, so a GIN index is very compact for cases where the same key appears many times.

GIN is generalized in the sense that the GIN access method code does not need to know the specific operations that it accelerates. Instead, it uses custom strategies defined for particular data types. The strategy defines how keys are extracted from indexed items and query conditions, and how to determine whether a row that contains some of the key values in a query actually satisfies the query.

One advantage of GIN is that it allows the development of custom data types with the appropriate access methods, by an expert in the domain of the data type, rather than a database expert. This is much the same advantage as using GiST .

The GIN implementation in PostgreSQL is primarily maintained by Teodor Sigaev and Oleg Bartunov. There is more information about GIN on their website.

Read article
Chapter 15. Parallel Query

Chapter 15. Parallel Query

Table of Contents

15.1. How Parallel Query Works
15.2. When Can Parallel Query Be Used?
15.3. Parallel Plans
15.3.1. Parallel Scans
15.3.2. Parallel Joins
15.3.3. Parallel Aggregation
15.3.4. Parallel Append
15.3.5. Parallel Plan Tips
15.4. Parallel Safety
15.4.1. Parallel Labeling for Functions and Aggregates

PostgreSQL can devise query plans that can leverage multiple CPUs in order to answer queries faster. This feature is known as parallel query. Many queries cannot benefit from parallel query, either due to limitations of the current implementation or because there is no imaginable query plan that is any faster than the serial query plan. However, for queries that can benefit, the speedup from parallel query is often very significant. Many queries can run more than twice as fast when using parallel query, and some queries can run four times faster or even more. Queries that touch a large amount of data but return only a few rows to the user will typically benefit most. This chapter explains some details of how parallel query works and in which situations it can be used so that users who wish to make use of it can understand what to expect.

Read article
Chapter 15. Parallel Query

Chapter 15. Parallel Query

Table of Contents

15.1. How Parallel Query Works
15.2. When Can Parallel Query Be Used?
15.3. Parallel Plans
15.3.1. Parallel Scans
15.3.2. Parallel Joins
15.3.3. Parallel Aggregation
15.3.4. Parallel Append
15.3.5. Parallel Plan Tips
15.4. Parallel Safety
15.4.1. Parallel Labeling for Functions and Aggregates

PostgreSQL can devise query plans that can leverage multiple CPUs in order to answer queries faster. This feature is known as parallel query. Many queries cannot benefit from parallel query, either due to limitations of the current implementation or because there is no imaginable query plan that is any faster than the serial query plan. However, for queries that can benefit, the speedup from parallel query is often very significant. Many queries can run more than twice as fast when using parallel query, and some queries can run four times faster or even more. Queries that touch a large amount of data but return only a few rows to the user will typically benefit most. This chapter explains some details of how parallel query works and in which situations it can be used so that users who wish to make use of it can understand what to expect.

Read article
53.35. pg_opfamily

53.35. pg_opfamily

The catalog pg_opfamily defines operator families. Each operator family is a collection of operators and associated support routines that implement the semantics specified for a particular index access method. Furthermore, the operators in a family are all compatible , in a way that is specified by the access method. The operator family concept allows cross-data-type operators to be used with indexes and to be reasoned about using knowledge of access method semantics.

Operator families are described at length in Section 38.16.

Table 53.35. pg_opfamily Columns

Column Type

Description

oid oid

Row identifier

opfmethod oid (references pg_am . oid )

Index access method operator family is for

opfname name

Name of this operator family

opfnamespace oid (references pg_namespace . oid )

Namespace of this operator family

opfowner oid (references pg_authid . oid )

Owner of the operator family


The majority of the information defining an operator family is not in its pg_opfamily row, but in the associated rows in pg_amop , pg_amproc , and pg_opclass .

Read article
53.35. pg_opfamily

53.35. pg_opfamily

The catalog pg_opfamily defines operator families. Each operator family is a collection of operators and associated support routines that implement the semantics specified for a particular index access method. Furthermore, the operators in a family are all compatible , in a way that is specified by the access method. The operator family concept allows cross-data-type operators to be used with indexes and to be reasoned about using knowledge of access method semantics.

Operator families are described at length in Section 38.16.

Table 53.35. pg_opfamily Columns

Column Type

Description

oid oid

Row identifier

opfmethod oid (references pg_am . oid )

Index access method operator family is for

opfname name

Name of this operator family

opfnamespace oid (references pg_namespace . oid )

Namespace of this operator family

opfowner oid (references pg_authid . oid )

Owner of the operator family


The majority of the information defining an operator family is not in its pg_opfamily row, but in the associated rows in pg_amop , pg_amproc , and pg_opclass .

Read article