All Activity

johan

changed the Assignee to 'Johan Gardner' on OPENIDM-454, OPENIDM-453

06:30
andi (deleted user)

closed OPENIDM-553

03 Feb
Fixed in 2.0.x for 2.0.2 and trunk with r899 and r901.
andi (deleted user)

created OPENIDM-553

03 Feb
andi (deleted user)

closed OPENIDM-551

03 Feb
Checked into trunk with r920, 2.0.x branch for 2.0.2 with r921.
jamiefnelson

created OPENIG-7

03 Feb
jamiefnelson

resolved OPENIG-6

03 Feb
Fixed the project hierarchy by modifying the POM to refer to openig-saas as its parent.
jamiefnelson

started progress on OPENIG-6

03 Feb
Peter Major

commented on OPENDJ-420

03 Feb

I've been trying to execute the searchrate command, but so far it has always failed on the client side with the following exception:

2012.02.03. 21:15:21 org.glassfish.grizzly.filterchain.DefaultFilterChain execute
WARNING: Exception during FilterChain execution
org.glassfish.grizzly.TransformationException: javax.net.ssl.SSLException: Received fatal alert: internal_error
at org.glassfish.grizzly.ssl.SSLDecoderTransformer.transformImpl(SSLDecoderTransformer.java:175)
at org.glassfish.grizzly.ssl.SSLDecoderTransformer.transformImpl(SSLDecoderTransformer.java:66)
at org.glassfish.grizzly.AbstractTransformer.transform(AbstractTransformer.java:73)
at org.glassfish.grizzly.filterchain.AbstractCodecFilter.handleRead(AbstractCodecFilter.java:71)
at org.glassfish.grizzly.ssl.SSLFilter.handleRead(SSLFilter.java:177)
at org.glassfish.grizzly.filterchain.ExecutorResolver$9.execute(ExecutorResolver.java:119)
at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeFilter(DefaultFilterChain.java:286)
at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeChainPart(DefaultFilterChain.java:223)
at org.glassfish.grizzly.filterchain.DefaultFilterChain.execute(DefaultFilterChain.java:155)
at org.glassfish.grizzly.filterchain.DefaultFilterChain.process(DefaultFilterChain.java:134)
at org.glassfish.grizzly.ProcessorExecutor.execute(ProcessorExecutor.java:78)
at org.glassfish.grizzly.nio.transport.TCPNIOTransport.fireIOEvent(TCPNIOTransport.java:829)
at org.glassfish.grizzly.strategies.AbstractIOStrategy.fireIOEvent(AbstractIOStrategy.java:103)
at org.glassfish.grizzly.strategies.SameThreadIOStrategy.executeIoEvent(SameThreadIOStrategy.java:96)
at org.glassfish.grizzly.nio.SelectorRunner.iterateKeyEvents(SelectorRunner.java:390)
at org.glassfish.grizzly.nio.SelectorRunner.iterateKeys(SelectorRunner.java:360)
at org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:326)
at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:262)
at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:508)
at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:488)
at java.lang.Thread.run(Thread.java:662)
Caused by: javax.net.ssl.SSLException: Received fatal alert: internal_error
at com.sun.net.ssl.internal.ssl.Alerts.getSSLException(Alerts.java:190)
at com.sun.net.ssl.internal.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1467)
at com.sun.net.ssl.internal.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1435)
at com.sun.net.ssl.internal.ssl.SSLEngineImpl.recvAlert(SSLEngineImpl.java:1601)
at com.sun.net.ssl.internal.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:1031)
at com.sun.net.ssl.internal.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:845)
at com.sun.net.ssl.internal.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:721)
at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:607)
at org.glassfish.grizzly.ssl.SSLDecoderTransformer.transformImpl(SSLDecoderTransformer.java:127)
... 20 more
Error occurred on one or more connections: Result(resultCode=Server Connection
Closed, matchedDN=, diagnosticMessage=, referrals=[], controls=[])

In the access logs I can see several successful requests, along with errors like the following from the same attempts:

[03/Feb/2012:21:19:57 +0100] DISCONNECT conn=72 reason="Protocol Error" msg="The client sent a request to the Directory Server that could not be properly decoded as an LDAP message: javax.net.ssl.SSLException: bad record MAC"
[03/Feb/2012:21:20:07 +0100] DISCONNECT conn=81 reason="Protocol Error" msg="The client sent a request to the Directory Server that could not be properly decoded as an LDAP message: javax.net.ssl.SSLException: Unsupported record version Unknown-29.91"

andi

committed 921 to openidm

03 Feb
OPENIDM-551 Backport up to r920 for: Initialize managed object level onStore, onRetrieve trigger scripts from configuration
andi

committed 920 to openidm

03 Feb
OPENIDM-551 Initialize managed object level onStore, onRetrieve trigger scripts from configuration
matthew

commented on OPENDJ-420

03 Feb

The SSLEngine Javadoc contains code illustrating how to deal with buffer overflow and underflow. We don't do this currently.

Note that the debug trace indicated that the buffers were regularly underflowing, but the buffer sizes were not changing, so I don't think the missing resize handling is the bug we are hitting here.

The bug looks more like data not being cleared from the network buffer between one unwrap and the next.
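For reference, the overflow/underflow handling and the buffer clearing described above can be sketched roughly as follows. This is only an illustration of the general SSLEngine pattern, not the OpenDJ TLSByteChannel code: the class name `UnwrapSketch`, its fields, and the `grow` helper are hypothetical, and the sketch assumes the network buffer is kept in write mode between calls.

```java
import java.nio.ByteBuffer;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult;
import javax.net.ssl.SSLException;
import javax.net.ssl.SSLSession;

// Hypothetical helper class, for illustration only.
final class UnwrapSketch {
    // Both buffers are kept in "write mode" between calls:
    // bytes 0..position are valid, not yet consumed data.
    ByteBuffer netIn;  // encrypted bytes read from the network
    ByteBuffer appIn;  // decrypted application bytes

    private final SSLEngine engine;

    UnwrapSketch(SSLEngine engine) {
        this.engine = engine;
        SSLSession session = engine.getSession();
        netIn = ByteBuffer.allocate(session.getPacketBufferSize());
        appIn = ByteBuffer.allocate(session.getApplicationBufferSize());
    }

    // Returns a buffer of at least minCapacity that holds the same
    // pending bytes as buf; returns buf itself if it is big enough.
    static ByteBuffer grow(ByteBuffer buf, int minCapacity) {
        if (buf.capacity() >= minCapacity) {
            return buf;
        }
        ByteBuffer bigger = ByteBuffer.allocate(minCapacity);
        buf.flip();       // switch to read mode to copy the pending bytes
        bigger.put(buf);  // bigger is left in write mode
        return bigger;
    }

    // Performs a single unwrap and reacts to the resulting status.
    SSLEngineResult.Status unwrapOnce() throws SSLException {
        netIn.flip();
        SSLEngineResult result;
        try {
            result = engine.unwrap(netIn, appIn);
        } finally {
            // Discard consumed bytes and keep any partial record, so that
            // stale data is not presented again to the next unwrap.
            netIn.compact();
        }
        switch (result.getStatus()) {
        case BUFFER_UNDERFLOW:
            // Not enough data for a full record: make sure a whole packet
            // can fit; the caller must then read more from the network.
            netIn = grow(netIn, engine.getSession().getPacketBufferSize());
            break;
        case BUFFER_OVERFLOW:
            // No room for the decrypted record: enlarge the destination;
            // the caller can then retry the unwrap.
            appIn = grow(appIn,
                    appIn.position() + engine.getSession().getApplicationBufferSize());
            break;
        default:
            break;
        }
        return result.getStatus();
    }
}
```

A caller would read from the channel into `netIn`, call `unwrapOnce()` in a loop, and drain `appIn` whenever the status is OK. The `compact()` call in the `finally` block is the step relevant to the suspicion above.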

andi (deleted user)

created OPENIDM-552, OPENIDM-551

03 Feb
matthew

commented on OPENDJ-420

03 Feb

The exceptions are being generated in org.opends.server.extensions.TLSByteChannel.read(ByteBuffer). The code looks slightly dubious: for example, it does not handle potential buffer size changes resulting from underflows and overflows.

Tcpdump output shows:

  • [C,S] connect + SSL handshake
  • [C] bind request
  • [S] bind response
  • [C] search request
  • [S] search response
  • [C] search request
  • ... more searches
  • [S] TCP RST/ACK

The client-side traffic looks fine, so this definitely looks like an issue in OpenDJ.

I've tried testing using OpenDJ SDK searchrate between a client and server on my laptop:

./bin/searchrate -p 1636 -w password -D cn=directory\ manager -Z -X -g "rand(0,2000)" -c 8 -t 4 -F -b ou=people,dc=example,dc=com "...big filter..." sunKeyValue sunxmlKeyValue

However, I have not been able to trigger the issue this way. A non-loopback connection appears to be necessary.

matthew

updated the Description of OPENDJ-420

03 Feb
Mark

resolved OPENAM-1092

03 Feb
matthew

created OPENDJ-420

03 Feb
Mark

committed 919 to openidm

03 Feb
Mention that the fixes and improvements list includes work done in 2.0.0, 2.0.1, and 2.0.2
Mark

committed 1642 to openam

03 Feb
Fix for OPENAM-1092: How to change the host name of an OpenAM instance
victor

created OPENAM-1092

03 Feb
matthew

created OPENDJ-419

03 Feb
Mark

committed 918 to openidm

03 Feb
Update fixes and known issues lists
mareks

changed the Assignee to 'Mareks Malnacs' on OPENAM-1088

03 Feb
jeno (deleted user)

created OPENAM-1088

02 Feb
matthew

commented on OPENDJ-373

02 Feb

Hi Matt,

This has been a fun exercise!

I checked your changes today and they look pretty much spot on. As you said, we just need to clean up the logging a bit and allocate an LDAP OID for the new config property. I'd also recommend changing the property name so that it begins with "preload" and is more closely aligned with the preload timeout property, making both easier to identify in dsconfig.

However...

I tested it pretty extensively today and had some surprising initial results:

  1. like you, in nearly all of the tests that I performed I found that the old sequential "one DB at a time" approach was best, and that preloading the entire environment in a single "concurrent" pass was nearly always significantly slower
  2. if I tried to load the environment in a single pass (I set concurrency = 100 for this) the JVM slowed to a halt pretty quickly and spent 100% of its time GCing

In light of (2) above I had a look at the preload config and saw that it has an option to limit the memory available to the preload (default: unlimited). Setting this option to 1MB prevented the poor GC behavior, since the preload had clearly been using more memory than was available. It also seemed to bring the fully concurrent preload (concurrency = 100) performance in line with the sequential (concurrency = 1) performance, so GCs were clearly causing the performance regression.

By setting the preload memory to 10MB I managed to get significant performance improvements for the case where the FS cache is empty and the DB is too big to fit in both the FS cache and the DB cache at the same time (DB of 5GB on a machine with 8GB RAM): the concurrent preload took around 200s and sequential took around 300s.

At this point I was becoming pleased with this new functionality, but then I tried to test it on a "mature" database. The previous tests had been performed on new, immature databases that had been freshly imported, whose log files are 100% utilized, and whose key order matches disk order. But this is not the common case: most DBs are mature and have had many updates made to them, causing their log files to become less utilized and their records to become "shuffled" on the disk.

To test a "mature" DB I ran a modrate test for several minutes causing each one of the 3M entries to be rewritten. This only impacted the id2entry DB since the changes did not touch indexed attributes. After the modrate the DB had grown from around 4GB on disk to around 7GB (~50% utilized).

Further attempts to preload the DB had disastrous results. Instead of taking between 3 and 5 minutes, preloads were now taking over 25 minutes. In fact, I never waited for them to finish. I tried concurrent and sequential preloads, and even reverted your changes, but saw no improvement.

It looks like the preload is looping through the DB log files over and over again which seems kind of strange since I expect preload to load everything in a single pass. Tomorrow I'm going to see if this is a regression from JE4 to JE5. Either way, I think that this is a defect of some sort in JE: a preload should just do a single pass of the log in my opinion.

I'll chase it up with the JE folk once I have enough data. The current terrible performance for mature DBs, with or without your changes, makes this enhancement more or less pointless unfortunately.

I'll keep you posted on my JE4 based tests and any responses I get from the JE devs.

Matt

andi (deleted user)

created OPENIDM-550

02 Feb
andi (deleted user)

closed OPENIDM-549

02 Feb
Fixed in trunk and 2.0.2.
Ludovic Poitou

resolved OPENDJ-418

02 Feb

Committed in the trunk.