I am trying to analyze the reason for exceptions/failures during an LDAP search. I am performing the operations using JNDI against an Active Directory domain controller.

Here is the background for what I am trying to do:

  1. Authenticate with SASL (Kerberos) through JAAS (Krb5LoginModule) to obtain a LoginContext.
  2. Once the login succeeds, the LoginContext holds the authenticated Subject, whose private credentials contain the Kerberos ticket (TGT).
  3. Create the LdapContext with GSSAPI, using that authenticated Subject.
  4. Use the LdapContext to perform JNDI operations (mostly paged searches).
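
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the code from my application: the class and method names are made up for the example, the JAAS entry name passed to login() is whatever entry your JAAS configuration defines, and it assumes that entry can log in without a callback handler (for example via a keytab or ticket cache). It uses Subject.doAs because the traces in this question come from Java 8.

```java
import java.security.PrivilegedExceptionAction;
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.ldap.InitialLdapContext;
import javax.naming.ldap.LdapContext;
import javax.security.auth.Subject;
import javax.security.auth.login.LoginContext;

// Hypothetical helper class, named for this example only.
final class GssapiLdapSketch {

    // Builds the JNDI environment for a SASL/GSSAPI bind. No principal or
    // password is set: the credentials come from the Subject that wraps the bind.
    static Hashtable<String, Object> gssapiEnvironment(String ldapUrl) {
        Hashtable<String, Object> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, ldapUrl);
        env.put(Context.SECURITY_AUTHENTICATION, "GSSAPI");
        return env;
    }

    // Steps 1 and 2: JAAS login. The entry name must exist in the JAAS
    // configuration and use Krb5LoginModule (assumption: no callbacks needed).
    static Subject login(String jaasEntryName) throws Exception {
        LoginContext loginContext = new LoginContext(jaasEntryName);
        loginContext.login();
        return loginContext.getSubject(); // holds the TGT in its private credentials
    }

    // Step 3: create the LdapContext while running as the authenticated Subject,
    // so the GSSAPI mechanism can pick up the Kerberos credentials.
    static LdapContext connect(Subject subject, String ldapUrl) throws Exception {
        return Subject.doAs(subject, (PrivilegedExceptionAction<LdapContext>) () ->
                new InitialLdapContext(gssapiEnvironment(ldapUrl), null));
    }
}
```

Step 4 then uses the returned LdapContext for the searches, for example connect(login("KerberosLogin"), "ldap://myad01.example.lab:389"), where "KerberosLogin" is a placeholder entry name.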

Up to this point everything is fine and the LdapContext is created correctly.

Some details of the Active Directory domain controller settings:

  1. The lifetime of the TGT is set to 1 hour.
  2. The lifetime of the service ticket (TGS) is set to 10 minutes (required due to some constraints, but this is how it is).

Now the Scenario:

  • Using the LdapContext created above, the client starts querying the domain controller with the paged results control, and things work smoothly for a while. The failures then occur at roughly regular intervals. I don't want to mislead you into assuming the interval is time-based: it could be measured in elapsed time or in number of searches.

  • When it goes to get the next page after such an interval, the search fails with:

       Caused by: javax.naming.CommunicationException: Connection reset
          at com.sun.jndi.ldap.LdapCtx.getSearchReply(LdapCtx.java:1920) ~[?:1.8.0_73]
          at com.sun.jndi.ldap.AbstractLdapNamingEnumeration.getNextBatch(AbstractLdapNamingEnumeration.java:130) ~[?:1.8.0_73]
          at com.sun.jndi.ldap.AbstractLdapNamingEnumeration.hasMoreImpl(AbstractLdapNamingEnumeration.java:217) ~[?:1.8.0_73]
          at com.sun.jndi.ldap.AbstractLdapNamingEnumeration.hasMore(AbstractLdapNamingEnumeration.java:189) ~[?:1.8.0_73]
    
          ... 10 more
    
       Caused by: java.net.SocketException: Connection reset
          at java.net.SocketInputStream.read(SocketInputStream.java:209) ~[?:1.8.0_73]
          at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:1.8.0_73]
          at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) ~[?:1.8.0_73]
          at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) ~[?:1.8.0_73]
          at java.io.BufferedInputStream.read(BufferedInputStream.java:345) ~[?:1.8.0_73]
          at com.sun.jndi.ldap.sasl.SaslInputStream.readFully(SaslInputStream.java:166) ~[?:1.8.0_73]
          at com.sun.jndi.ldap.sasl.SaslInputStream.fill(SaslInputStream.java:123) ~[?:1.8.0_73]
          at com.sun.jndi.ldap.sasl.SaslInputStream.read(SaslInputStream.java:90) ~[?:1.8.0_73]
          at com.sun.jndi.ldap.Connection.run(Connection.java:860) ~[?:1.8.0_73]
          at java.lang.Thread.run(Thread.java:745) ~[?:1.8.0_73]
    

At the same time, I see the following event on the Active Directory domain controller (Event ID 2889):

 Log Name:      Directory Service
 Source:        Microsoft-Windows-ActiveDirectory_DomainService
 Event ID:      2889
 Task Category: LDAP Interface
 Level:         Information
 Keywords:      Classic
 User:          ANONYMOUS LOGON
 Computer:      myad01.example.lab
 Description:The following client performed a SASL (Negotiate/Kerberos/NTLM/Digest) LDAP bind without
 requesting signing (integrity verification), or performed a simple bind over a clear text (non-
 SSL/TLS-encrypted) LDAP connection. 

 Client IP address: X.X.X.X:56260 
 Identity the client attempted to authenticate as:EXAMPLE\Administrator 
 Binding Type:0

I also see an event with Event ID 1216. The details are as follows:

 Log Name:      Directory Service
 Source:        Microsoft-Windows-ActiveDirectory_DomainService
 Event ID:      1216
 Task Category: LDAP Interface
 Level:         Warning
 Keywords:      Classic
 User:          N/A
 Computer:      myad01.example.lab
 Description:Internal event: An LDAP client connection was closed because of an error. 

 Client IP:X.X.X.X:56244 

 Additional Data 
 Error value: 1236 The network connection was aborted by the local system. 
 Internal ID: c060420

My understanding: after some interval, when the client goes to get the next page, the server invalidates the LDAP connection (as suggested by Event ID 1216), which is why I get the CommunicationException.

My questions: why do I get this after a certain interval and not immediately? Is it because the Kerberos TGT and service tickets have expired? If so, how should I design around the paging problem? After the CommunicationException, if I create a new LdapContext and set the paging control, I get the following exception, as expected:

  javax.naming.OperationNotSupportedException: [LDAP: error code 12 - 00000057: LdapErr: DSID-0C090B0B, comment: Error processing control, data 0, v3839 ]
     at com.sun.jndi.ldap.LdapCtx.mapErrorCode(Unknown Source) ~[?:1.8.0_201]
     at com.sun.jndi.ldap.LdapCtx.processReturnCode(Unknown Source) ~[?:1.8.0_201]
     at com.sun.jndi.ldap.LdapCtx.processReturnCode(Unknown Source) ~[?:1.8.0_201]
     at com.sun.jndi.ldap.LdapCtx.searchAux(Unknown Source) ~[?:1.8.0_201]
     at com.sun.jndi.ldap.LdapCtx.c_search(Unknown Source) ~[?:1.8.0_201]
     at com.sun.jndi.toolkit.ctx.ComponentDirContext.p_search(Unknown Source) ~[?:1.8.0_201]
     at com.sun.jndi.toolkit.ctx.PartialCompositeDirContext.search(Unknown Source) ~[?:1.8.0_201]
     at javax.naming.directory.InitialDirContext.search(Unknown Source) ~[?:1.8.0_201]
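
For reference, the paging pattern I am describing looks roughly like the sketch below: the cookie returned with one page is sent back with the next request on the same LdapContext. The class and method names are invented for this example and the base DN, filter and page size are supplied by the caller. If the connection is reset in the middle of this loop, the cookie held by the client is the one that the server then refuses on a new LdapContext.

```java
import java.util.ArrayList;
import java.util.List;
import javax.naming.NamingEnumeration;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;
import javax.naming.ldap.Control;
import javax.naming.ldap.LdapContext;
import javax.naming.ldap.PagedResultsControl;
import javax.naming.ldap.PagedResultsResponseControl;

// Hypothetical helper class, named for this example only.
final class PagedSearchSketch {

    // Returns the cookie for the next page, or null when the server
    // signals that there are no more pages (or sent no paging response).
    static byte[] nextCookie(Control[] responseControls) {
        if (responseControls == null) {
            return null;
        }
        for (Control control : responseControls) {
            if (control instanceof PagedResultsResponseControl) {
                byte[] cookie = ((PagedResultsResponseControl) control).getCookie();
                return (cookie != null && cookie.length > 0) ? cookie : null;
            }
        }
        return null;
    }

    // Collects the DN of every entry, one page at a time, on a single context.
    static List<String> searchAllPages(LdapContext ctx, String baseDn,
                                       String filter, int pageSize) throws Exception {
        SearchControls searchControls = new SearchControls();
        searchControls.setSearchScope(SearchControls.SUBTREE_SCOPE);

        List<String> names = new ArrayList<>();
        byte[] cookie = null; // null on the first request
        do {
            ctx.setRequestControls(new Control[] {
                new PagedResultsControl(pageSize, cookie, Control.CRITICAL)
            });
            NamingEnumeration<SearchResult> page = ctx.search(baseDn, filter, searchControls);
            while (page.hasMore()) { // the CommunicationException surfaces here
                names.add(page.next().getNameInNamespace());
            }
            cookie = nextCookie(ctx.getResponseControls());
        } while (cookie != null);
        return names;
    }
}
```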

It is really important for me to support both SASL (Kerberos) for authentication and GSSAPI for LdapContext creation. Paging is also important because the data set is huge, and we can't impose any restrictions on ticket validity because I can't control the customers' environments!

Please provide pointers on how to debug this issue further, and suggest a proper fix or a workaround (sorry for asking, but I need one anyway).

1 Answer

My observations might help someone analyze a similar issue. First of all, Event ID 1216 is generated on Active Directory when the client closes its underlying socket (the question mentions JNDI, so the client here is simply the LdapContext/DirContext). Have a look at this link.

An LdapContext is essentially a connection, formed with certain connection settings, between a client (for example, JNDI) and the LDAP server (for example, Active Directory Domain Services). A connection between any two entities on a network is usually backed by a client-side socket and a server-side socket, and an LdapContext likewise has an underlying socket.

When the LdapContext is obtained through GSSAPI, the underlying socket comes with a timeout equal to the service ticket lifetime configured on AD DS. Once that validity period ends, the socket is closed. If the LdapContext then tries to query AD DS, the javax.naming.CommunicationException: Connection reset shown above occurs and communication fails.
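
One way to check whether the resets line up with ticket expiry is to log the end times of the Kerberos tickets held by the authenticated Subject and compare them with the timestamps of the exceptions. The sketch below is only an illustration with invented class and method names. The TGT from the JAAS login is in the private credentials; whether the service ticket also appears there depends on the JDK and its configuration, so treat that as an assumption to verify.

```java
import java.util.Date;
import java.util.Optional;
import javax.security.auth.Subject;
import javax.security.auth.kerberos.KerberosTicket;

// Hypothetical helper class, named for this example only.
final class TicketExpirySketch {

    // Returns the earliest end time among the Kerberos tickets held by the
    // Subject, or an empty Optional if it holds none.
    static Optional<Date> earliestTicketExpiry(Subject subject) {
        Date earliest = null;
        for (KerberosTicket ticket : subject.getPrivateCredentials(KerberosTicket.class)) {
            Date end = ticket.getEndTime(); // null if the ticket was destroyed
            if (end != null && (earliest == null || end.before(earliest))) {
                earliest = end;
            }
        }
        return Optional.ofNullable(earliest);
    }

    // Prints one line per ticket so the end times can be compared with the
    // time at which the "Connection reset" exception was logged.
    static void printTickets(Subject subject) {
        for (KerberosTicket ticket : subject.getPrivateCredentials(KerberosTicket.class)) {
            System.out.println(ticket.getServer() + " expires at " + ticket.getEndTime());
        }
    }
}
```

If the exception timestamps consistently fall just after the earliest end time, that supports the explanation above. If they do not, the cause is probably elsewhere.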

Because the lifetime/validity settings for the TGT and TGS live on AD DS, there is no way to bypass them using GSSAPI. If the LdapContext must remain usable for longer, the only way out is to increase the validity of the respective tickets.