From Wikipedia, the free encyclopedia
Extensible Storage Engine
Other names: JET Blue
Developer(s): Microsoft
Initial release: 1994
Repository:
Written in: C++
Operating system: Microsoft Windows
Platform: IA-32, x86-64, ARM and Itanium (and historically DEC Alpha, MIPS, and PowerPC)
Type: Database engine
License: MIT License
Website: docs.microsoft.com/en-us/windows/win32/extensible-storage-engine/extensible-storage-engine

Extensible Storage Engine (ESE), also known as JET Blue, is an ISAM (indexed sequential access method) data storage technology from Microsoft. ESE is the core of Microsoft Exchange Server, Active Directory, and Windows Search. It is also used by a number of Windows components including Windows Update client and Help and Support Center. Its purpose is to allow applications to store and retrieve data via indexed and sequential access.

ESE provides transacted data update and retrieval. A crash recovery mechanism is provided so that data consistency is maintained even in the event of a system crash. Transactions in ESE are highly concurrent, making ESE suitable for server applications. ESE caches data to provide high-performance access to it. In addition, ESE is lightweight, making it suitable for auxiliary applications.

The ESE runtime (ESENT.DLL) has shipped in every Windows release since Windows 2000, with a native x64 version of the ESE runtime shipping with x64 versions of Windows XP and Windows Server 2003. Microsoft Exchange, up to Exchange 2003, shipped with only the 32-bit edition, as it was the only supported platform. Exchange 2007 ships with the 64-bit edition.

Databases

A database is both a physical and logical grouping of data. An ESE database looks like a single file to Windows. Internally the database is a collection of 2, 4, 8, 16, or 32 KB pages (16 and 32 KB page options are only available in Windows 7 and Exchange 2010),[1] arranged in a balanced B-tree structure.[2] These pages contain meta-data to describe the data contained within the database, data itself, indexes to persist interesting orders of the data, and other information. This information is intermixed within the database file but efforts are made to keep data used together clustered together within the database. An ESE database may contain up to 2^32 pages, or 16 terabytes of data,[3] for 8 kilobyte sized pages.

ESE databases are organized into groups called instances. Most applications use a single instance, but all applications can also use multiple instances. The importance of the instance is that it associates a single recovery log series with one or more databases. Currently, up to 6 user databases may be attached to an ESE instance at any time. Each separate process using ESE may have up to 1024 ESE instances.

A database is portable in that it can be detached from one running ESE instance and later attached to the same or a different running instance. While detached, a database may be copied using standard Windows utilities. The database cannot be copied while it is being actively used since ESE opens database files exclusively. A database may physically reside on any device supported for directly addressable I/O operations by Windows.

Tables

A table is a homogeneous collection of records, where each record has the same set of columns. Each table is identified by a table name, whose scope is local to the database in which the table is contained. The amount of disk space allocated to a table within a database is determined by a parameter given when the table is created with the CreateTable operation. Tables grow automatically in response to data creation.

Tables have one or more indexes. There must be at least one clustered index for record data. When no clustered index is defined by the application, an artificial index is used which orders and clusters records by the chronological order of record insertion. Indexes are defined to persist interesting orders of data, and allow both sequential access to records in index order, and direct access to records by index column values. Clustered indexes in ESE must also be primary, meaning that the index key must be unique.

Clustered and non-clustered indexes are represented using B+ trees. If an insert or update operation causes a page to overflow, the page is split: a new page is allocated and is logically chained in between the two previously adjacent pages. Since this new page is not physically adjacent to its logical neighbors, access to it is not as efficient. ESE has an on-line compaction feature that re-compacts data. If a table is expected to be frequently updated, space may be reserved for future insertions by specifying an appropriate page density when creating a table or index. This allows split operations to be avoided or postponed.

Records and columns

A record is an associated set of column values. Records are inserted and updated via Update operations and can be deleted via Delete operations. Columns are set and retrieved via SetColumns and RetrieveColumns operations, respectively. The maximum size of a record is 8110 bytes for 8 kilobyte pages with the exception of long value columns. Column types of LongText and LongBinary do not contribute significantly to this size limitation, and records can hold data much larger than a database page size when data is stored in long value columns. When a long value reference is stored in a record, only 9 bytes of in-record data are required. These long values may themselves be up to 2 gigabytes (GB) in size.

Records are typically uniform in that each record has a set of values for the same set of columns. In ESE, it is also possible to define many columns for a table, and yet have any given record contain only a small number of non-NULL column values. In this sense, a table can also be a collection of heterogeneous records.

ESE supports a wide range of column values, ranging in size from 1 bit to 2 GB. Choosing the correct column type is important because the type of a column determines many of its properties, including its ordering for indexes. The following data types are supported by ESE:

Column types

Name                Description
Bit                 ternary value (NULL, 0, or 1)
Unsigned Byte       1-byte unsigned integer
Short               2-byte signed integer
Unsigned Short      2-byte unsigned integer
Long                4-byte signed integer
Unsigned Long       4-byte unsigned integer
LongLong            8-byte signed integer
UnsignedLongLong    8-byte unsigned integer
Currency            8-byte signed integer
IEEE Single         4-byte floating-point number
IEEE Double         8-byte floating-point number
DateTime            8-byte date-time (integral date, fractional time)
GUID                16-byte unique identifier
Binary              Binary string, length <= 255 bytes
Text                ANSI or Unicode string, length <= 255 bytes
Long Binary         Large binary string, length < 2 GB
Long Text           Large ANSI or Unicode string, length < 2 GB

Fixed, variable and tagged columns

Each ESE table can define up to 127 fixed length columns, 128 variable length columns and 64,993 tagged columns.

  • Fixed columns take up the same amount of space in each record, regardless of their value. Each fixed column takes up 1 bit to represent the NULLity of the column value and a fixed amount of space in each record in which that column, or a later defined fixed column, is set.
  • Variable columns take up a variable amount of space in each record in which they are set, depending upon the size of the particular column value. Each variable column takes up 2 bytes to determine NULLity and size, and a variable amount of space in each record in which that column is set.
  • Tagged columns take no space whatsoever if they are not set in a record, which makes them ideal for sparse columns. They may be single valued but can also be multi-valued: the same tagged column may have multiple values in a single record. When tagged columns are set in a record, each instance of a tagged column takes approximately 4 bytes of space in addition to the size of the tagged column instance value. When the number of instances of a single tagged column is large, the overhead per tagged column instance is approximately 2 bytes. If a multi-valued tagged column is indexed, the index will contain one entry for the record for each value of the tagged column.

For a given table, columns fall into one of two categories: those which occur exactly once in each of the records, with possibly a few NULL values; and those which occur rarely, or which may have multiple occurrences in a single record. Fixed and variable columns belong to the former category, while tagged columns belong to the latter. The internal representation of the two column categories is different, and it is important to understand the trade-offs between the column categories. Fixed and variable columns are typically represented in every record, even when the occurrence has a NULL value. These columns can be quickly addressed via an offset table. Tagged column occurrences are preceded by a column identifier and the column is located by binary searching the set of tagged columns.
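
The difference between the two lookup paths can be sketched in Python. This is a rough illustration only, not ESE's on-disk format: the class `Record` and its fields are hypothetical names, fixed columns are modelled as a list addressed by slot number (standing in for the offset table), and tagged columns as a list sorted by column identifier.

```python
from bisect import bisect_left

class Record:
    """Toy model of a record; names and layout are illustrative assumptions."""

    def __init__(self, fixed_values, tagged_values):
        # Fixed/variable columns: one slot per defined column, even when NULL (None).
        self.fixed = list(fixed_values)
        # Tagged columns: only the columns actually set, sorted by column id.
        self.tagged = sorted(tagged_values.items())
        self.tagged_ids = [col_id for col_id, _ in self.tagged]

    def get_fixed(self, slot):
        # Direct addressing: the slot number acts like an offset-table entry.
        return self.fixed[slot]

    def get_tagged(self, col_id):
        # Binary search over the column ids present in this record.
        i = bisect_left(self.tagged_ids, col_id)
        if i < len(self.tagged_ids) and self.tagged_ids[i] == col_id:
            return self.tagged[i][1]
        return None  # an unset tagged column occupies no space and reads as NULL

rec = Record(fixed_values=[7, None, 3], tagged_values={300: ["a", "b"], 256: "x"})
```

In this model a NULL fixed column still occupies a slot, while a tagged column that is not set simply does not appear in the record.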

Long values

Column types of Long Text and Long Binary are large binary objects. They are stored in a separate B+ tree from the clustered index, keyed by long value ID and byte offset. ESE supports append, byte range overwrite, and set size for these columns. Also, ESE has a single instance store feature where multiple records may reference the same large binary object, as though each record had its own copy of the information, i.e. without inter-record locking conflicts. The maximum size of a Long Text or Long Binary column value is 2 GB.

Version, auto-increment and escrow columns

Version columns are automatically incremented by ESE each time a record containing this column is modified via an Update operation. This column cannot be set by the application, but can only be read. One use of version columns is to determine whether an in-memory copy of a given record needs to be refreshed: if the value in a table record is greater than the value in a cached copy, then the cached copy is known to be out of date. Version columns must be of type Long.

Auto increment columns are automatically set by ESE such that the value contained in the column is unique for every record in the table. These columns, like version columns, cannot be set by the application. Auto increment columns are read only, and are automatically set when a new record is inserted into a table via an Update operation. The value in the column remains constant for the life of the record, and only one auto increment column is allowed per table. Auto increment columns may be of type Long or type Currency.

Escrow columns can be modified via an EscrowUpdate operation. Escrowed updates are numeric delta operations. Escrow columns must be of type Long. Examples of numeric delta operations include adding 2 to a value or subtracting 1 from a value. ESE tracks the change in a value rather than the end value of an update. Multiple sessions may each have outstanding changes made via EscrowUpdate to the same value because ESE can determine the actual end value regardless of which transactions commit and which transactions roll back. This allows multiple users to concurrently update a column by making numeric delta changes. Optionally, the database engine can erase records in which the column value is zero. A common use for such an escrow column is a reference counter: many threads increment or decrement the value without locks, and when the counter reaches zero, the record is automatically deleted.
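
The idea of tracking deltas rather than end values can be sketched as follows. This is a simplified model in Python, not the ESE API: the class `EscrowColumn`, its methods and the session identifiers are all hypothetical.

```python
class EscrowColumn:
    """Toy model of an escrow column; not the ESE API."""

    def __init__(self, value=0):
        self.committed = value
        self.pending = {}  # session id -> net uncommitted delta

    def escrow_update(self, session, delta):
        # Record only the change, never the resulting value.
        self.pending[session] = self.pending.get(session, 0) + delta

    def commit(self, session):
        # Addition is commutative, so the commit order does not matter.
        self.committed += self.pending.pop(session, 0)

    def rollback(self, session):
        # Discarding one delta leaves the other sessions' deltas untouched.
        self.pending.pop(session, None)

refcount = EscrowColumn(1)
refcount.escrow_update("s1", +2)
refcount.escrow_update("s2", -1)
refcount.escrow_update("s3", +5)
refcount.commit("s2")
refcount.rollback("s3")
refcount.commit("s1")
```

Starting from 1, the committed deltas of +2 and -1 leave the counter at 2, whichever session commits first; the rolled-back +5 never affects it.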

Indexes

An index is a persisted ordering of records in a table. Indexes are used for both sequential access to rows in the order defined, and for direct record navigation based on indexed column values. The order defined by an index is described in terms of an array of columns, in precedence order. This array of columns is also called the index key. Each column is called an index segment. Each index segment may be either ascending or descending, in terms of its ordering contribution. Any number of indexes may be defined for a table. ESE provides a rich set of indexing features.

Clustered indexes

One index may be specified as the clustered, or primary, index. In ESE, the clustered index must be unique and is referred to as the primary index. Other indexes are described as non-clustered, or secondary, indexes. Primary indexes are different from secondary indexes in that the index entry is the record itself, and not a logical pointer to the record. Secondary indexes have primary keys at their leaves to logically link to the record in the primary index. In other words, the table is physically clustered in primary index order. Retrieval of non-indexed record data in primary index order is generally much faster than in secondary index order. This is because a single disk access can bring into memory multiple records that will be accessed close together in time. The same disk access satisfies multiple record access operations. However, the insertion of a record into the middle of an index, as determined by the primary index order, may be much slower than appending it to the end of an index. Update frequency must be carefully considered against retrieval patterns when performing table design. If no primary index is defined for a table, then an implicit primary index, called a database key (DBK) index, is created. The DBK is simply a unique ascending number incremented each time a record is inserted. As a result, the physical order of records in a DBK index is chronological insertion order, and new records are always added at the end of the table. If an application wishes to cluster data on a non-unique index, this is possible by adding an auto-increment column to the end of the non-unique index definition.
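
The last point, clustering on a non-unique column by appending an auto-increment value to the key, can be illustrated with a small Python sketch. The data and variable names are made up for illustration, and a sorted list stands in for the clustered B+ tree.

```python
from itertools import count

autoinc = count(1)  # stands in for an auto-increment column
rows = []
for city in ["Rome", "Oslo", "Rome"]:
    # Appending the auto-increment value makes every key unique,
    # even though the city alone is not.
    rows.append(((city, next(autoinc)), {"city": city}))

rows.sort(key=lambda row: row[0])  # physical order follows the primary key
primary_keys = [key for key, _ in rows]
```

Records with the same city end up adjacent in key order, while the trailing number keeps each primary key distinct.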

Indexing over multi-valued columns

Indexes can be defined over multi-valued columns. Multiple entries may exist in these indexes for records with multiple values for the indexed column. Multi-valued columns may be indexed in conjunction with single valued columns. When two or more multi-valued columns are indexed together, the multi-valued property is only honored for the first multi-valued column in the index. Lower precedence columns are treated as though they were single valued.
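
A minimal Python sketch of the resulting index contents, assuming a single multi-valued column and representing each index entry as a (key, primary key) pair. The function `index_entries` and the sample data are hypothetical.

```python
def index_entries(records, column):
    """Return sorted (key, primary_key) pairs, one per value of the column."""
    entries = []
    for primary_key, record in records.items():
        for value in record[column]:
            entries.append((value, primary_key))
    return sorted(entries)

records = {
    1: {"tags": ["blue", "red"]},  # two values, so two index entries
    2: {"tags": ["red"]},
}
entries = index_entries(records, "tags")
```

Record 1 appears twice in the index, once under each of its values.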

Sparse indexes

Indexes can also be defined to be sparse. Sparse indexes do not have at least one entry for each record in the table. There are a number of options in defining a sparse index. Options exist to exclude records from indexes when an entire index key is NULL, when any key segment is NULL or when just the first key segment is NULL. Indexes can also have conditional columns. These columns never appear within an index but can cause a record not to be indexed when the conditional column is either NULL or non-NULL.

Tuple indexes

Indexes can also be defined to include one entry for each sub-string of a Text or Long Text column. These indexes are called tuple indexes. They are used to speed queries with sub-string matching predicates. Tuple indexes can only be defined for Text columns. For example, if a Text column value is “I love JET Blue”, and the index is configured to have a minimum tuple size of 4 characters and a maximum tuple length of 10 characters, then the following sub-strings will be indexed:

“I love JET”
“ love JET ”
“love JET B”
“ove JET Bl”
“ve JET Blu”
“e JET Blue”
“ JET Blue”
“JET Blue”
“ET Blue”
“T Blue”
“ Blue”
“Blue”
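
The list above can be reproduced with a short Python sketch. The function `tuple_index_keys` is a hypothetical helper that models only which sub-strings are indexed, assuming one entry per starting position, cut at the maximum tuple length and dropped when shorter than the minimum.

```python
def tuple_index_keys(text, min_len, max_len):
    """Return the sub-strings indexed under this simplified tuple-index model."""
    keys = []
    for start in range(len(text)):
        key = text[start:start + max_len]
        if len(key) >= min_len:  # tails shorter than the minimum are not indexed
            keys.append(key)
    return keys

keys = tuple_index_keys("I love JET Blue", min_len=4, max_len=10)
```

For the 15-character example string this gives 12 entries, from “I love JET” down to “Blue”; the 3-character tail “lue” is too short to be indexed.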

Even though tuple indexes can be very large, they can significantly speed queries of the form: find all records containing “JET Blue”. They can be used for sub-strings longer than the maximum tuple length by dividing the search sub-string into maximum tuple length search strings and intersecting the results. They can be used for exact matches for strings as long as the maximum tuple length or as short as the minimum tuple length, with no index intersection. For more information on performing index intersection in ESE see Index Intersection. Tuple indexes cannot speed queries where the search string is shorter than the minimum tuple length.

Transactions

A transaction is a logical unit of processing delimited by BeginTransaction and CommitTransaction, or Rollback, operations. All updates performed during a transaction are atomic; they either all appear in the database at the same time or none appear. Any subsequent updates by other transactions are invisible to a transaction. However, a transaction can update only data that has not changed in the meantime; otherwise the operation fails at once without waiting. Read-only transactions never need to wait, and update transactions can interfere only with one another. Transactions which are terminated by Rollback, or by a system crash, leave no trace on the database. In general, the data state is restored on Rollback to what it was prior to BeginTransaction.

Transactions may be nested up to 7 levels, with one additional level reserved for ESE internal use. This means that a part of a transaction may be rolled back, without need to roll back the entire transaction; a CommitTransaction of a nested transaction merely signifies the success of one phase of processing, and the outer transaction may yet fail. Changes are committed to the database only when the outermost transaction is committed. This is known as committing to transaction level 0. When the transaction commits to transaction level 0, data describing the transaction is synchronously flushed to the log to ensure that the transaction will be completed even in the event of a subsequent system crash. Synchronously flushing the log makes ESE transactions durable. However, in some cases applications wish to order their updates, but not immediately guarantee that changes will be done. Here, applications can commit changes with JET_bitCommitLazyFlush.
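
Nested transaction levels can be modelled in a few lines of Python. This is a sketch under simplifying assumptions: the class `Session` and its methods are hypothetical, each level keeps a full working copy of the data, and logging is left out entirely.

```python
class Session:
    """Toy nested-transaction model; method names are illustrative, not the ESE API."""

    MAX_LEVEL = 7  # nesting depth available to applications, per the text above

    def __init__(self):
        self.durable = {}  # state reached by commits to level 0
        self.levels = []   # one working copy of the data per open level

    def begin(self):
        if len(self.levels) >= self.MAX_LEVEL:
            raise RuntimeError("transaction nesting too deep")
        base = self.levels[-1] if self.levels else self.durable
        self.levels.append(dict(base))

    def set(self, key, value):
        self.levels[-1][key] = value

    def commit(self):
        changes = self.levels.pop()
        if self.levels:
            self.levels[-1] = changes  # inner commit: folded into the parent level
        else:
            self.durable = changes     # commit to level 0: changes become permanent

    def rollback(self):
        self.levels.pop()              # discard only the innermost level

s = Session()
s.begin()
s.set("a", 1)
s.begin()
s.set("b", 2)
s.rollback()  # undoes only the nested work
s.begin()
s.set("c", 3)
s.commit()    # inner commit, not yet durable
visible_before_outer_commit = dict(s.durable)
s.commit()    # commit to level 0
```

The rolled-back inner level leaves no trace, and nothing becomes durable until the outermost commit.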

ESE supports a concurrency control mechanism called multi-versioning. In multi-versioning, every transaction queries a consistent view of the entire database as it was at the time the transaction started. The only updates it encounters are those made by it. In this way, each transaction operates as though it was the only active transaction running on the system, except in the case of write conflicts. Since a transaction may make changes based on data read that has already been updated in another transaction, multi-versioning by itself does not guarantee serializable transactions. However, serializability can be achieved when desired by simply using explicit record read locks to lock read data that updates are based upon. Both read and write locks may be explicitly requested with the GetLock operation.
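
The snapshot behaviour and the write-conflict case can be sketched in Python. This is a deliberately small model, not ESE's version store: the class `VersionStore`, its logical clock and its method names are assumptions made for illustration.

```python
class VersionStore:
    """Toy multi-version store; a simplification, not ESE's implementation."""

    def __init__(self):
        self.clock = 0
        self.versions = {}  # key -> list of (commit_time, value), oldest first

    def begin(self):
        # A transaction remembers the commit time at which it started.
        return self.clock

    def read(self, key, snapshot):
        # Return the newest value committed at or before the snapshot.
        result = None
        for commit_time, value in self.versions.get(key, []):
            if commit_time <= snapshot:
                result = value
        return result

    def commit_write(self, key, value, snapshot):
        # Write conflict: a newer version was committed after our snapshot.
        history = self.versions.setdefault(key, [])
        if history and history[-1][0] > snapshot:
            raise RuntimeError("write conflict")
        self.clock += 1
        history.append((self.clock, value))

store = VersionStore()
store.commit_write("x", "old", store.begin())
reader = store.begin()  # snapshot taken here
store.commit_write("x", "new", store.begin())
```

A transaction holding the `reader` snapshot still sees the old value of `x`, and an attempt to write `x` from that snapshot fails immediately rather than waiting.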

In addition, an advanced concurrency control feature known as escrow locking is supported by ESE. Escrow locking is an extremely concurrent update where a numeric value is changed in a relative fashion, i.e. by adding or subtracting another numeric value. Escrow updates are non-conflicting even with other concurrent escrow updates to the same datum. This is possible because the operations supported are commutative and can be independently committed or rolled back. As a result, they do not interfere with concurrent update transactions. This feature is often used to maintain aggregates.

ESE also extends transaction semantics from data manipulation operations to data definition operations. It is possible to add an index to a table and have concurrently running transactions update the same table without any transaction lock contention whatsoever. Later, when these transactions are complete, the newly created index is available to all transactions and has entries for record updates made by other transactions that could not perceive the presence of the index when the updates took place. Data definition operations may be performed with all the features expected of the transaction mechanism for record updates. Data definition operations supported in this fashion include AddColumn, DeleteColumn, CreateIndex, DeleteIndex, CreateTable and DeleteTable.

Cursor navigation and the copy buffer

A cursor is a logical pointer within a table index. The cursor may be positioned on a record, before the first record, after the last record or even between records. If a cursor is positioned before or after a record, there is no current record. It is possible to have multiple cursors into the same table index. Many record and column operations are based on the cursor position. Cursor position can be moved sequentially by Move operations or directly using index keys with Seek operations. Cursors can also be moved to a fractional position within an index. In this way, the cursor can be quickly moved to a thumb bar position. This operation is performed with the same speed as a Seek operation; no intervening data needs to be accessed.

Each cursor has a copy buffer in order to create a new record, or modify an existing record, column by column. This is an internal buffer whose contents can be changed with SetColumns operations. Modifications of the copy buffer do not automatically change the stored data. The contents of the current record can be copied into the copy buffer using the PrepareUpdate operation, and Update operations store the contents of the copy buffer as a record. The copy buffer is implicitly cleared on a transaction commit or rollback, as well as on navigation operations. RetrieveColumns may be used to retrieve column data either from the record or from the copy buffer, if one exists.
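
The relationship between the stored record and the copy buffer can be sketched as follows. The Python class `Cursor` is a hypothetical model whose method names loosely mirror the operations named above; it is not a binding to the real API and omits transactions.

```python
class Cursor:
    """Toy cursor with a copy buffer; names mirror the prose, not a real binding."""

    def __init__(self, table):
        self.table = table  # list of dict records
        self.position = 0
        self.copy_buffer = None

    def prepare_update(self):
        # Copy the current record into the copy buffer.
        self.copy_buffer = dict(self.table[self.position])

    def set_column(self, name, value):
        # Changes only the copy buffer, not the stored record.
        self.copy_buffer[name] = value

    def retrieve_column(self, name, from_copy_buffer=False):
        source = self.copy_buffer if from_copy_buffer else self.table[self.position]
        return source[name]

    def update(self):
        # Store the copy buffer as the record, then clear the buffer.
        self.table[self.position] = self.copy_buffer
        self.copy_buffer = None

    def move(self, offset):
        # Navigation implicitly discards any pending copy buffer.
        self.position += offset
        self.copy_buffer = None

table = [{"name": "ada"}, {"name": "bob"}]
cur = Cursor(table)
cur.prepare_update()
cur.set_column("name", "eve")
stored_before_update = cur.retrieve_column("name")
buffered = cur.retrieve_column("name", from_copy_buffer=True)
cur.update()
```

Until `update` is called, the stored record still holds the old value while the copy buffer holds the new one.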

Query processing

ESE applications invariably query their data. This section describes features and techniques that applications can use to write query processing logic on ESE.

Sorts and temporary tables

ESE provides a sort capability in the form of temporary tables. The application inserts data records into the sort process one record at a time, and then retrieves them one record at a time in sorted order. Sorting is actually performed between the last record insertion and the first record retrieval. Temporary tables can be used for partial and complete result sets as well. These tables can offer the same features as base tables including the ability to navigate sequentially or directly to rows using index keys matching the sort definition. Temporary tables can also be updatable for computation of complex aggregates. Simple aggregates can be computed automatically with a feature similar to sorting where the desired aggregate is a natural result of the sort process.

Covering indexes

Retrieving column data directly from secondary indexes is an important performance optimization. Columns may be retrieved directly from secondary indexes, without accessing the data records, via the RetrieveFromIndex flag on the RetrieveColumns operation. It is much more efficient to retrieve columns from a secondary index than from the record when navigating by the index. If the column data were retrieved from the record, an additional navigation would be necessary to locate the record by the primary key, which may result in additional disk accesses. When an index provides all the columns needed, it is called a covering index. Note that columns defined in the table primary index are also found in secondary indexes and can be similarly retrieved using JET_bitRetrieveFromPrimaryBookmark.

Index keys are stored in normalized form which can be, in many cases, denormalized to the original column value. Normalization is not always reversible. For example, Text and Long Text column types cannot be denormalized. In addition, index keys may be truncated when column data is very long. In cases where columns cannot be retrieved directly from secondary indexes, the record can always be accessed to retrieve the necessary data.

Index intersection

Queries often involve a combination of restrictions on data. An efficient means of processing a restriction is to use an available index. However, if a query involves multiple restrictions then applications often process the restrictions by walking the full index range of the most restrictive predicate satisfied by a single index. Any remaining predicate, called the residual predicate, is processed by applying the predicate to the record itself. This is a simple method but has the disadvantage of potentially having to perform many disk accesses to bring records into memory to apply the residual predicate.

Index intersection is an important query mechanism in which multiple indexes are used together to more efficiently process a complex restriction. Instead of using only a single index, index ranges on multiple indexes are combined to result in a much smaller number of records on which any residual predicate can be applied. ESE makes this easy by supplying an IntersectIndexes operation. This operation accepts a series of index ranges on indexes from the same table and returns a temporary table of primary keys that can be used to navigate to the base table records that satisfy all index predicates.
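
The effect of intersecting index ranges can be sketched in Python with sets of primary keys. The helper functions `index_range` and `intersect_indexes` and the sample indexes are hypothetical; they model the result of the operation, not how ESE computes it.

```python
def index_range(index, low, high):
    """Return the primary keys whose index key lies in [low, high]."""
    return {pk for key, pk in index if low <= key <= high}

def intersect_indexes(*ranges):
    """Combine index ranges; only keys present in every range survive."""
    result = set(ranges[0])
    for r in ranges[1:]:
        result &= r
    return sorted(result)

# Secondary indexes modelled as sorted (key, primary_key) pairs.
age_index = [(25, 1), (31, 2), (31, 4), (47, 3)]
city_index = [("Oslo", 2), ("Oslo", 3), ("Rome", 1), ("Rome", 4)]

matches = intersect_indexes(
    index_range(age_index, 30, 40),
    index_range(city_index, "Oslo", "Oslo"),
)
```

Each range alone matches two records, but only one primary key satisfies both predicates, so only that record has to be fetched from the base table.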

Pre-joined tables

A join is a common operation on a normalized table design, where logically related data is brought back together for use in an application. Joins can be expensive operations because many data accesses may be needed to bring related data into memory. This effort can be optimized in some cases by defining a single base table that contains data for two or more logical tables. The column set of the base table is the union of the column sets of these logical tables. Tagged columns make this possible because of their good handling of both multi-valued and sparse valued data. Since related data is stored together in the same record, it is accessed together thereby minimizing the number of disk accesses to perform the join. This process can be extended to a large number of logical tables as ESE can support up to 64,993 tagged columns. Since indexes can be defined over multi-valued columns, it is still possible to index ‘interior’ tables. However, some limitations exist and applications should consider pre-joining carefully before employing this technique.

Logging and crash recovery

The logging and recovery feature of ESE supports guaranteed data integrity and consistency in the event of a system crash. Logging is the process of redundantly recording database update operations in a log file. The log file structure is very robust against system crashes. Recovery is the process of using this log to restore databases to a consistent state after a system crash.

Transaction operations are logged and the log is flushed to disk during each commit to transaction level 0. This allows the recovery process to redo updates made by transactions which commit to transaction level 0, and undo changes made by transactions which did not commit to transaction level 0. This type of recovery scheme is often referred to as a ‘roll-forward/roll-backward’ recovery scheme. Logs can be retained until the data is safely copied via a backup process described below, or logs can be reused in a circular fashion as soon as they are no longer needed for recovery from system crash. Circular logging minimizes the amount of disk space needed for the log but has implications on the ability to recreate a data state in the event of a media failure.
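
The recovery rule, keeping the work of committed transactions and discarding the rest, can be sketched in Python. The log format here, tuples of (transaction, operation, key, value), is an assumption made for illustration and bears no relation to ESE's actual log file layout.

```python
def recover(log):
    """Rebuild state from a log of (txn, op, key, value) entries.

    Assumed entry shapes: ("t1", "set", key, value) and ("t1", "commit", None, None).
    """
    committed = {txn for txn, op, _, _ in log if op == "commit"}
    state = {}
    for txn, op, key, value in log:
        # Redo updates of committed transactions; skipping the others has
        # the same effect in this model as undoing them.
        if op == "set" and txn in committed:
            state[key] = value
    return state

log = [
    ("t1", "set", "a", 1),
    ("t2", "set", "b", 2),
    ("t1", "commit", None, None),
    ("t2", "set", "c", 3),  # t2 never commits before the crash
]
state = recover(log)
```

Only the update made by `t1` survives recovery, because `t2` never reached its commit record.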

Backup and restore

Logging and recovery also play a role in protecting data from media failure. ESE supports on-line backup where one or more databases are copied, along with log files in a manner that does not affect database operations. Databases can continue to be queried and updated while the backup is being made. The backup is referred to as a ‘fuzzy backup’ because the recovery process must be run as part of backup restoration to restore a consistent set of databases. Both streaming and shadow copy backup are supported.

Streaming backup is a backup method where copies of all desired database files and the necessary log files are made during the backup process. File copies may be saved directly to tape or can be made to any other storage device. No quiescing of activity of any kind is required with streamed backups. Both the database and log files are checksummed to ensure that no data corruptions exist within the data set during the backup process. Streaming backups may also be incremental backups. Incremental backups are ones in which only the log files are copied and which can be restored along with a previous full backup to bring all databases to a recent state.

Shadow copy backups are a high-speed backup method. They are dramatically faster because the copy is made virtually after a brief period of quiescing an application. As subsequent updates are made to the data, the virtual copy is materialized. In some cases, hardware support for shadow copy backups means that actually saving the virtual copies is unnecessary. Shadow copy backups are always full backups.

Restore can be used to apply a single backup, or it can be used to apply a combination of a single full backup with one or more incremental backups. Further, any existing log files can be replayed as well to recreate an entire data set all the way up to the last transaction logged as committed to transaction level 0. Restoration of a backup can be made to any system capable of supporting the original application. It need not be the same machine, or even the same machine configuration. Location of files can be changed as part of the restoration process.

Backup and restore to different hardware

When an ESENT database is created, the physical disk sector size is stored with the database. The physical sector size is expected to remain consistent between sessions; otherwise, an error is reported. When a physical drive is cloned or restored from a drive image to a drive that uses a different physical sector size (Advanced Format Drives), ESENT will report errors.[4]

This is a known issue, and Microsoft has hotfixes available. For Windows Vista or Windows Server 2008, see KB2470478.[5] For Windows 7 or Windows Server 2008 R2, see KB982018.[6]

History

JET Blue was originally developed by Microsoft as a prospective upgrade for the JET Red database engine in Microsoft Access, but was never used in this role. Instead, it went on to be used by Exchange Server, Active Directory, File Replication Service (FRS), Security Configuration Editor, Certificate Services, Windows Internet Name Service (WINS) and a host of other Microsoft services, applications and Windows components.[7] For years, it was a private API used by Microsoft only, but has since become a published API that anyone can use.

Work began on the Data Access Engine (DAE) in March 1989, when Allen Reiter joined Microsoft. Over the next year, a team of four developers worked for Reiter to largely complete the ISAM. Microsoft already had the BC7 ISAM (JET Red) but began the DAE effort to build a more robust database engine as an entry in the then-new client-server architecture realm. In the spring of 1990, the BC7 ISAM and DAE teams were joined to become the Joint Engine Technology (JET) effort, responsible for producing two engines, a v1 (JET Red) and a v2 (JET Blue), that would conform to the same API specification (JET API). DAE became JET Blue for the color of the flag of Israel. BC7 ISAM became JET Red for the color of the flag of Russia. While JET Blue and JET Red were written to the same API specification, they shared no ISAM code whatsoever. They did both support a common query processor, QJET, which later, together with the BC7 ISAM, became synonymous with JET Red.

JET Blue first shipped in 1994 as an ISAM for WINS, DHCP, and the now-defunct RPL services in Windows NT 3.5. It shipped again as the storage engine for Microsoft Exchange in 1996. Additional Windows services chose JET Blue as their storage technology, and by 2000 every version of Windows shipped with JET Blue. JET Blue was used by Active Directory and became part of a special set of Windows code called the Trusted Computing Base (TCB). The number of Microsoft applications using JET Blue continues to grow, and the JET Blue API was published in 2005 to facilitate usage by an ever-increasing number of applications and services both within and beyond Windows.

A Microsoft Exchange Web Blog entry[8] stated that developers who have contributed to JET Blue include Cheen Liao, Stephen Hecht, Matthew Bellew, Ian Jose, Edward "Eddie" Gilbert, Kenneth Kin Lum, Balasubramanian Sriram, Jonathan Liem, Andrew Goodsell, Laurion Burchall, Andrei Marinescu, Adam Foxman, Ivan Trindev, Spencer Low and Brett Shirley.

In January 2021, Microsoft open-sourced ESE.[9] It was posted to GitHub under the permissive MIT License.

Comparison to JET Red

While they share a common lineage, there are vast differences between JET Red and ESE.

  • JET Red is a file sharing technology, while ESE is designed to be embedded in a server application, and does not share files.
  • JET Red makes best-effort file recovery, while ESE has write-ahead logging and snapshot isolation for guaranteed crash recovery.
  • JET Red before version 4.0 supports only page-level locking, while ESE and JET Red version 4.0 support record-level locking.
  • JET Red supports a wide variety of query interfaces, including ODBC and OLE DB. ESE does not ship with a query engine but instead relies on applications to write their own queries as C ISAM code.
  • JET Red has a maximum database file size of 2 GiB, while ESE has a maximum database file size of 8 TiB with 4 KiB pages, and 16 TiB with 8 KiB pages.
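
The two ESE size limits quoted above scale linearly with page size, which is what a fixed maximum page count would produce. The short Python check below assumes a limit of 2^31 pages per database; that figure is inferred from the numbers in the list rather than stated in it, and the names `MAX_PAGES` and `max_database_bytes` are illustrative.

```python
KIB = 1024
TIB = 1024 ** 4
MAX_PAGES = 2 ** 31  # assumption inferred from the quoted limits


def max_database_bytes(page_size_bytes, max_pages=MAX_PAGES):
    """Largest file size reachable with a fixed number of fixed-size pages."""
    return page_size_bytes * max_pages


for page_kib in (4, 8):
    size = max_database_bytes(page_kib * KIB)
    print(f"{page_kib} KiB pages: {size // TIB} TiB")
# prints:
# 4 KiB pages: 8 TiB
# 8 KiB pages: 16 TiB
```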

References

  1. ^ In this context 1 KB = 1024 B
  2. ^ "Extensible Storage Engine Architecture". TechNet. Retrieved 2025-08-06.
  3. ^ In this context 1 TB = 1024⁴ B
  4. ^ "Acronis Products: Applications Build on ESENT Running on Windows Vista, Windows Server 2008 and Windows 7 may not work correctly after restoring or cloning to a drive with different physical sector | Knowledge Base". kb.acronis.com.
  5. ^ "Applications that are built on ESENT and that run on a Windows Vista-based or Windows Server 2008-based computer may not work correctly after the reported physical sector size of the storage device changes". Archived from the original on 2025-08-06. Retrieved 2025-08-06.
  6. ^ "An update that improves the compatibility of Windows 7 and Windows Server 2008 R2 with Advanced Format Disks is available". support.microsoft.com.
  7. ^ "Extensible Storage Engine". Microsoft.
  8. ^ "Extensible Storage Engine". Retrieved 2025-08-06.
  9. ^ "Microsoft Open Sources ESE, the Extensible Storage Engine". 3 February 2021. Retrieved 2025-08-06.