8 - Schema

Schema parsers

We use a set of classes to parse schema elements. There are 11 flavors of schema elements: 8 of them are described in an RFC, and the following 3 are ApacheDS proprietary:


  • LdapComparator
  • Normalizer
  • SyntaxChecker

We need to be able to parse those schema elements because they can be added into the server as a description (i.e., a String representing one of those schema elements as defined by the RFC). For the same reason, the LDAP API needs to validate those schema elements before sending them to an LDAP server, and to be able to properly parse what it gets back from an LDAP server.

Strict vs quirks mode

Here we have a problem: most LDAP server implementations violate the RFC. We can't simply expect the String representing a schema element to be compliant with the RFC. Some typical deviations are:

  • OpenLDAP uses macros instead of OIDs. This is convenient, as it allows the root OID to be defined with a name and reused in the associated schema elements
  • AD and many other servers expect some specific characters to be accepted, like '_', ':', '#', …
  • Sometimes, values come without quotes even though quotes are required
  • etc.

We will define the strict mode as a mode which follows the RFC tightly, and the quirks mode as a relaxed, more permissive version of the parser. One can select either the strict or the relaxed mode using a flag.

Strict mode

The only thing we will relax in strict mode is the order in which the various parts appear in a schema description: we don't expect them to be ordered as described in the RFC.
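
The order-independent handling described above can be sketched as follows. This is a minimal illustration in Python, not the parser used by the API: the names parse_parts, VALUE_KEYWORDS and FLAG_KEYWORDS are hypothetical, the keyword sets cover only the AttributeType description, and values are kept as raw tokens without further validation.

```python
import re

# Keywords of an AttributeType description that are followed by a value.
VALUE_KEYWORDS = {"NAME", "DESC", "SUP", "EQUALITY", "ORDERING",
                  "SUBSTR", "SYNTAX", "USAGE"}
# Keywords that are plain flags, with no value.
FLAG_KEYWORDS = {"OBSOLETE", "SINGLE-VALUE", "COLLECTIVE",
                 "NO-USER-MODIFICATION"}

# A token is a quoted string, a parenthesis, a dollar sign, or a bare word.
TOKEN = re.compile(r"'[^']*'|[()$]|[^\s()$']+")


def parse_parts(description):
    """Collect the parts of a description into a dict, in any order."""
    tokens = TOKEN.findall(description)
    if len(tokens) < 3 or tokens[0] != "(" or tokens[-1] != ")":
        raise ValueError("description must be wrapped in parentheses")
    parts = {"OID": tokens[1]}
    i, end = 2, len(tokens) - 1
    while i < end:
        keyword = tokens[i]
        i += 1
        if keyword in parts:
            raise ValueError("duplicate part: " + keyword)
        if keyword in FLAG_KEYWORDS:
            parts[keyword] = True
        elif keyword in VALUE_KEYWORDS:
            if i >= end:
                raise ValueError("missing value for " + keyword)
            if tokens[i] == "(":
                # Parenthesised list: keep the items, drop '$' separators.
                close = tokens.index(")", i)
                if close >= end:
                    raise ValueError("unterminated list for " + keyword)
                parts[keyword] = [t for t in tokens[i + 1:close] if t != "$"]
                i = close + 1
            else:
                parts[keyword] = tokens[i]
                i += 1
        else:
            raise ValueError("unknown keyword: " + keyword)
    return parts
```

With this sketch, "( 2.5.4.3 NAME 'cn' DESC 'Common name' SUP name )" and "( 2.5.4.3 SUP name DESC 'Common name' NAME 'cn' )" produce the same dictionary, because the parts are looked up by keyword rather than by position.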

The various parts are defined using a few syntaxes :

  • NAME: qdescrs

  • DESC: qdstring

  • SUP (ObjectClass), MUST, MAY, APPLIES, AUX, NOT: oids

  • SUP (AttributeType), EQUALITY, ORDERING, SUBSTR, FORM, OC: oid

  • SYNTAX (AttributeType): noidlen

  • SYNTAX (MatchingRule): numericoid

  • SUP (DitStructureRule): ruleids

  • descr: oid, qdescrs

  • qdescr: qdescrs, qdescrlist

qdescrs and oids may each contain one or more items (qdescr and oid, respectively).
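
To make the single-versus-list distinction concrete, here is a small Python sketch. It assumes the RFC 4512 list forms, where a qdescrs list is space-separated inside parentheses and an oids list is separated by '$'. The function names are hypothetical and the patterns follow the strict character rules.

```python
import re

NUMBER = r"(?:0|[1-9][0-9]*)"
NUMERICOID = rf"{NUMBER}(?:\.{NUMBER})+"
DESCR = r"[A-Za-z][A-Za-z0-9-]*"
OID = rf"(?:{DESCR}|{NUMERICOID})"
QDESCR = rf"'{DESCR}'"

# One item, or a parenthesised list of items.
QDESCRS = re.compile(rf"{QDESCR}|\( *{QDESCR}(?: +{QDESCR})* *\)")
OIDS = re.compile(rf"{OID}|\( *{OID}(?: *\$ *{OID})* *\)")


def parse_qdescrs(text):
    """Return the names of a qdescrs value, without their quotes."""
    if QDESCRS.fullmatch(text) is None:
        raise ValueError("not a valid qdescrs: " + text)
    return re.findall(r"'([^']+)'", text)


def parse_oids(text):
    """Return the OIDs (names or numeric OIDs) of an oids value."""
    if OIDS.fullmatch(text) is None:
        raise ValueError("not a valid oids: " + text)
    return re.findall(OID, text)
```

For example, parse_qdescrs("( 'cn' 'commonName' )") yields the two names cn and commonName, and parse_oids("( cn $ sn $ 2.5.4.4 )") yields three entries.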

descr, strict

The descr construct is used by oid and qdescrs (an OID can be a name). The strict mode will use this grammar :

descr       ::= keystring
keystring   ::= leadkeychar keychar*
leadkeychar ::= ALPHA
keychar     ::= ALPHA | DIGIT | HYPHEN
ALPHA       ::= ['A'..'Z'] | ['a'..'z']
DIGIT       ::= ['0'..'9']
HYPHEN      ::= '-'
SQUOTE      ::= '\''
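
As an illustration, the strict descr grammar can be checked character by character. The following Python sketch mirrors the productions above; is_strict_descr is a hypothetical name, not part of the API.

```python
import string

ALPHA = set(string.ascii_letters)   # 'A'..'Z' and 'a'..'z'
DIGIT = set(string.digits)          # '0'..'9'
HYPHEN = {"-"}


def is_strict_descr(text):
    """True if text matches: leadkeychar keychar*."""
    if not text or text[0] not in ALPHA:
        return False                # leadkeychar must be ALPHA
    keychars = ALPHA | DIGIT | HYPHEN
    return all(ch in keychars for ch in text[1:])
```

Under these rules "cn" and "user-certificate" are accepted, while "2cn" (leading digit) and "my_attr" (underscore) are rejected.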

qdstring, strict

A qdstring can contain any UTF-8 character except the single quote and the backslash, which must be escaped (as \27 and \5C respectively). It is always surrounded by single quotes:

qdstring    ::= SQUOTE dstring SQUOTE
dstring     ::= ( QS | QQ | QUTF8 )*
QQ          ::= ESC %x32 %x37
QS          ::= ESC %x35 ( %x43 | %x63 )
QUTF8       ::= QUTF1 | UTFMB
QUTF1       ::= %x00-26 | %x28-5B | %x5D-7F
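
The escaping rules can be sketched in Python as a validator that also decodes the value. This is an illustration under the grammar above, with a hypothetical function name. It works on an already decoded Python string, so UTFMB characters are simply any non-ASCII characters.

```python
import re

# Body: escaped quote (\27), escaped backslash (\5C or \5c),
# or any character other than a single quote or a backslash.
QDSTRING = re.compile(r"'((?:\\27|\\5[Cc]|[^'\\])*)'")


def decode_qdstring(text):
    """Validate a strict qdstring and return its unescaped content."""
    match = QDSTRING.fullmatch(text)
    if match is None:
        raise ValueError("not a valid qdstring: " + text)

    def unescape(escape):
        return "'" if escape.group(1) == "27" else "\\"

    return re.sub(r"\\(27|5[Cc])", unescape, match.group(1))
```

A raw single quote inside the value, as in 'it's', is rejected, whereas 'it\27s' decodes to it's.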

qdescr, strict

qdescr is a quoted name, where the first character must be alphabetic and the following characters must be alphabetic characters, digits or hyphens. Here is the ABNF for qdescr:

qdescr      ::= SQUOTE descr SQUOTE

noidlen, strict

Relaxed mode

qdstring, relaxed


descr, relaxed

The relaxed descr accepts more characters, such as underscore, semicolon, dot, colon or sharp. The leading alphabetic character (leadkeychar) is no longer mandatory either. Here is the ABNF we will accept:

relaxed-descr   ::= relaxed-keystring
relaxed-keystring ::= keychar+
keychar         ::= ALPHA | DIGIT | HYPHEN | UNDERSCORE | SEMI_COLON | COLON | SDOT | SHARP
ALPHA           ::= ['A'..'Z'] | ['a'..'z']
DIGIT           ::= ['0'..'9']
HYPHEN          ::= '-'
UNDERSCORE      ::= '_'
SEMI_COLON      ::= ';'
COLON           ::= ':'
SDOT            ::= '.'
SHARP           ::= '#'
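
The difference between the two modes can be shown with a single check that takes a flag. In this Python sketch the relaxed parameter is a hypothetical stand-in for the mode flag; the character sets come from the strict and relaxed grammars in this section.

```python
import re

# Strict: a letter, then letters, digits or hyphens.
STRICT_DESCR = re.compile(r"[A-Za-z][A-Za-z0-9-]*")
# Relaxed: one or more of the extended characters, no leading-letter rule.
RELAXED_DESCR = re.compile(r"[A-Za-z0-9_;:.#-]+")


def is_descr(text, relaxed=False):
    """Check text against the strict or the relaxed descr grammar."""
    pattern = RELAXED_DESCR if relaxed else STRICT_DESCR
    return pattern.fullmatch(text) is not None
```

For instance, is_descr("my_attr") is false in strict mode but is_descr("my_attr", relaxed=True) is true. A value containing a space is rejected in both modes.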

qdescr, relaxed

Compared to the strict mode, we will accept a non-quoted String, or a String using double quotes.

relaxed-qdescr  ::= SQUOTE relaxed-descr SQUOTE | DQUOTE relaxed-descr DQUOTE | relaxed-descr
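
A possible way to accept the three forms (single-quoted, double-quoted, unquoted) is sketched below in Python. The function name is hypothetical; the sketch returns the bare name in every case.

```python
import re

DESCR = r"[A-Za-z0-9_;:.#-]+"
# Exactly one of the three groups matches: 'name', "name" or name.
RELAXED_QDESCR = re.compile(rf"'({DESCR})'|\"({DESCR})\"|({DESCR})")


def parse_relaxed_qdescr(text):
    """Return the name carried by a relaxed qdescr, without quotes."""
    match = RELAXED_QDESCR.fullmatch(text)
    if match is None:
        raise ValueError("not a valid relaxed qdescr: " + text)
    return next(group for group in match.groups() if group is not None)
```

Mismatched quotes, such as a single quote on one side and a double quote on the other, are still rejected by this sketch.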

oid, relaxed

In relaxed mode, we will accept single-quoted and double-quoted OIDs and names. Here is the supported ABNF:

relaxed-oid ::= SQUOTE relaxed-descr SQUOTE | DQUOTE relaxed-descr DQUOTE | relaxed-descr |
                SQUOTE numericoid SQUOTE | DQUOTE numericoid DQUOTE | numericoid
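
Since the relaxed descr character set also covers digits and dots, a parser has to decide whether a value is a numeric OID or a name. The Python sketch below tries numericoid first. This ordering and the function name are assumptions made for illustration, and "myRoot:1.2" in the usage note is a made-up macro-style name.

```python
import re

NUMBER = r"(?:0|[1-9][0-9]*)"
NUMERICOID = re.compile(rf"{NUMBER}(?:\.{NUMBER})+")
RELAXED_DESCR = re.compile(r"[A-Za-z0-9_;:.#-]+")


def parse_relaxed_oid(text):
    """Return ('numericoid', value) or ('descr', value), quotes removed."""
    value = text
    # Strip one pair of matching single or double quotes, if present.
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "'\"":
        value = text[1:-1]
    if NUMERICOID.fullmatch(value):
        return ("numericoid", value)
    if RELAXED_DESCR.fullmatch(value):
        return ("descr", value)
    raise ValueError("not a valid relaxed oid: " + text)
```

With this sketch, '2.5.4.3' (quoted) is classified as a numeric OID, while "cn" and myRoot:1.2 are classified as names.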

noidlen, relaxed

Here, we will allow textual syntax names to be used, not only OIDs. For instance, something like SYNTAX IA5String will be allowed.

We also allow single-quoted and double-quoted OIDs.
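
A relaxed noidlen could be handled along these lines. This Python sketch assumes that the optional {len} suffix of the RFC form is kept and that, when quotes are used, the length sits inside them. Neither point is stated above, and the function name is hypothetical.

```python
import re

# Optional opening quote, a syntax OID or name, an optional {len},
# then the same quote again (or nothing if there was no opening quote).
RELAXED_NOIDLEN = re.compile(
    r"(?P<quote>['\"]?)"
    r"(?P<syntax>[A-Za-z0-9_;:.#-]+)"
    r"(?:\{(?P<len>[0-9]+)\})?"
    r"(?P=quote)"
)


def parse_relaxed_noidlen(text):
    """Return (syntax, length), where length is None when absent."""
    match = RELAXED_NOIDLEN.fullmatch(text)
    if match is None:
        raise ValueError("not a valid relaxed noidlen: " + text)
    length = match.group("len")
    return (match.group("syntax"), int(length) if length else None)
```

Here IA5String is returned as a name without a length, and 1.3.6.1.4.1.1466.115.121.1.15{64} is split into the OID and the length 64.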