Tuesday, September 21, 2010

Coherence: when the key is not the whole key

Yog-Sothoth is the key and guardian of the gate. Past, present, future, all are one in Yog-Sothoth. He knows where the Old Ones broke through of old, and where They shall break through again.
- "The Dunwich Horror", H.P. Lovecraft
Recently at work I was involved in implementing a distributed cache with Coherence to improve performance with a third party web service. Integrating the call to the web service with the cache was not difficult, however I did encounter a twist in that calls to the web service required information which, while tracked by that service, did not in any way affect the result we got back from the service. In other words, only part of the key was needed to correctly retrieve a value from the cache, however all of the key was needed to create the value to add to the cache.

My first thought was that this would be simple to deal with since all Coherence caches are ultimately just extensions of java.util.Map. If we defined the equals and hashCode methods using just the properties relevant to retrieving items from the cache we could have the best of both worlds: the data would be present when needed to call the external web service and the cache would not include redundant entries. However this did not work, and the particulars of why and what I did to fix things are the topics for this post.

Working code (flavored by the Mythos) will greatly help to illustrate the solution, so lets get started. Playing the part of client will be a Cultist who will be attempting to summon (i.e. make a request) of an Ancient One (who will be playing the part of the service). Our cache map will use the Cultist as a key and the result of the summoning (a String) as the value.

First up is the code for the Cultist.
Cultist
import java.io.Serializable;

public class Cultist implements Serializable{
  private static final long serialVersionUID = 5892594640095448056L;
  private String ancientOne;
  private String name;

  public String getAncientOne() {
    return ancientOne;
  }
  public void setAncientOne(String ancientOne) {
    this.ancientOne = ancientOne;
  }
  public String getName() {
    return name;
  }
  public void setName(String name) {
    this.name = name;
  }

  @Override
  public int hashCode() {
    return 31 + ((ancientOne == null) ? 0 : ancientOne.hashCode());
  }

  @Override
  public boolean equals(Object obj) {
    if (this == obj) return true;
    if (obj == null) return false;
    if (getClass() != obj.getClass()) return false;
    Cultist other = (Cultist) obj;
    if (ancientOne == null) {
      if (other.ancientOne != null) return false;
    }
    else if (!ancientOne.equals(other.ancientOne)) return false;
    return true;
  }

  @Override
  public String toString() {
    return "Caller: " + name + " -- Summoning: " + ancientOne;
  }
}
Notes
  • In order to reproduce the problem I encountered, the equals and hashCode methods only rely on the ancientOne field.
Next we need a class that processes the Cultist's summoning requests. Since this can happen anywhere, anytime, let's call this the TimeAndSpace class. This class has one method, processSummoning, which takes a Cultist as an argument and returns a String giving the result of the summoning.
TimeAndSpace
import java.util.HashMap;
import java.util.Map;

public class TimeAndSpace {
    private static final Map<String, String> responseMap;
    static {
      responseMap = new HashMap<String, String>();
      responseMap.put(KeyRitual.AO_ONE, "Sleeping");
      responseMap.put(KeyRitual.AO_TWO, "On vacation");
    }

    public static String processSummoning(Cultist cultist) {
      // While not pertinent to the result, the name is used ... in this case for output.
      System.out.println("TimeAndSpace: Processing summoning by " + cultist.getName() + " for " + cultist.getAncientOne());
      return responseMap.get(cultist.getAncientOne());
    }
}
Notes
  • In order to reproduce the problem I encountered, the name of the Cultist is used by this class but has no effect on the return value for a given summoning.
  • We track every time the processSummoning method is called for debugging purposes.
  • We will only work with two Ancient Ones, defined in a class below.

Next up is the FhtagnLoader class that will be used by Coherence to add entries to the cache.
FhtagnLoader
import com.tangosol.net.cache.AbstractCacheLoader;

public class FhtagnLoader extends AbstractCacheLoader {

  public FhtagnLoader() {
    super();
  }

  @Override
  public Object load(Object obj) {
    if (!(obj instanceof Cultist)) {
      return null;
    }
    return TimeAndSpace.processSummoning((Cultist) obj);
  }
}
Notes
  • All this does is delegate to TimeAndSpace to obtain the value to place in the cache.

The configuration file for the cache is in a file coherence-fhtagn.xml:
coherence-fhtagn.xml
<?xml version="1.0"?>
<!DOCTYPE cache-config SYSTEM "cache-config.dtd">
<cache-config>
  <caching-scheme-mapping>
    <cache-mapping>
      <cache-name>fhtagnCache</cache-name>
      <scheme-name>fhtagnCache</scheme-name>
    </cache-mapping>
  </caching-scheme-mapping>
  <caching-schemes>
   <distributed-scheme>
      <scheme-name>fhtagnCache</scheme-name>
      <service-name>fhtagnCache</service-name>
      <autostart>true</autostart>
      <backing-map-scheme>
        <read-write-backing-map-scheme>
          <internal-cache-scheme>
            <local-scheme>
            </local-scheme>
          </internal-cache-scheme>
          <cachestore-scheme>
            <class-scheme>
             <class-name>FhtagnLoader</class-name>
            </class-scheme>
          </cachestore-scheme>
          <read-only>true</read-only>
        </read-write-backing-map-scheme>
        <autostart>true</autostart>
      </backing-map-scheme>
    </distributed-scheme>
  </caching-schemes>
</cache-config>


Last, we have the KeyRitual class which we will run to demononstrate the use of the cache.
KeyRitual
import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;

public class KeyRitual {
  public static final String AO_ONE = "Cthulhu";
  public static final String AO_TWO = "Yog-Sothoth";
  private static final String[] ANCIENT_ONES = {AO_ONE, AO_TWO};
  public static final String CACHE="fhtagnCache";

  public static void main(String... args) {
    NamedCache fhtagnCache = CacheFactory.getCache(CACHE);

    // list initial size
    ListCacheSizes(fhtagnCache);

    for (String ancientOne: ANCIENT_ONES) {
      // will make three calls
      for (int i = 0; i < 3; i++) {
        Cultist cultist = new Cultist();
        cultist.setAncientOne(ancientOne);
        cultist.setName("Cultist " + System.currentTimeMillis());
        // Track result of each call
        System.out.println("KeyRitual: Key [" + cultist + "] :: Value [" +  (String) fhtagnCache.get(cultist) + "]");
      }
    }

    // list ending size
    ListCacheSizes(fhtagnCache);
  }

  /** Helper method to print out size of caches in list **/
  static private void ListCacheSizes(NamedCache... caches) {
    for (NamedCache cache: caches) {
      System.out.println("Cache " + cache.getCacheName() + " has size: " + cache.size());
    }
  }
}
Notes
  • The ritual involves two distinct Ancient ones, Cthulhu and Yog-Sothoth, each of which will be summoned three times (each time by a different cultist).
  • We track the key/value pair for each summoning. Since equality for Cultist does not depend on the cultist name, we would expect to see two message logged from the TimeAndSpace class as well as a final cache size of two.

Placing all files are in the same folder as a copy of the coherence jar file, compiling and running gives:

codefhtagn: javac -cp coherence.jar *.java
codefhtagn: java -cp coherence.jar:. -Dtangosol.coherence.cacheconfig=coherence-fhtagn.xml KeyRitual
... lines omitted ... 
2010-09-20 13:19:09.021/4.404 Oracle Coherence GE 3.4.2/411 <D5> (thread=DistributedCache:fhtagnCache, member=1): Service fhtagnCache joined the cluster with senior service member 1
Cache fhtagnCache has size: 0
TimeAndSpace: Processing summoning by Cultist 1285010349086 for Cthulhu
KeyRitual: Key [Caller: Cultist 1285010349086 -- Summoning: Cthulhu] :: Value [Sleeping]
TimeAndSpace: Processing summoning by Cultist 1285010349097 for Cthulhu
KeyRitual: Key [Caller: Cultist 1285010349097 -- Summoning: Cthulhu] :: Value [Sleeping]
TimeAndSpace: Processing summoning by Cultist 1285010349098 for Cthulhu
KeyRitual: Key [Caller: Cultist 1285010349098 -- Summoning: Cthulhu] :: Value [Sleeping]
TimeAndSpace: Processing summoning by Cultist 1285010349099 for Yog-Sothoth
KeyRitual: Key [Caller: Cultist 1285010349099 -- Summoning: Yog-Sothoth] :: Value [On vacation]
KeyRitual: Key [Caller: Cultist 1285010349099 -- Summoning: Yog-Sothoth] :: Value [On vacation]
TimeAndSpace: Processing summoning by Cultist 1285010349100 for Yog-Sothoth
KeyRitual: Key [Caller: Cultist 1285010349100 -- Summoning: Yog-Sothoth] :: Value [On vacation]
Cache fhtagnCache has size: 5
2010-09-20 13:19:09.102/4.485 Oracle Coherence GE 3.4.2/411 <D4> (thread=ShutdownHook, member=1): ShutdownHook: stopping cluster node


Clearly things did not go as planned: there were five calls to TimeAndSpace as well as five entries in the cache. A closer look reveals that we have one entry per distinct cultist (we ended up with five cultists, not six, since our use of timestamp ended up giving two of them the same name - this could and did vary with other runs). Clearly our definition of equals and hashCode for Cultist is being ignored.

After doing some digging on the Web I found a very helpful blog entry: coherence-key-howto. To summarize: for distributed caches Coherence judges key identity based on the serialized form of the key (which turns out to be a Binary object) which in turn is based on all the non transient properties of the key. The actual equals and hashCode of the object (Cultist in our case) are ignored. The simplest fix would be to mark the name attribute of the Cultist as transient. However, since TimeAndSpace requires the entire Cultist key, this option is not available to us.

After some playing around, the solution I hit upon was to introduce a forwarding cache that sits in front of the original cache (which I'll now refer to as the backing cache). The forwarding cache intercepts all incoming get requests and performs the following logic:
  1. Given an incoming cultist key, create a new cultist key object using just the ancientOne property. Call this new key the relevant key.
  2. Check to see if there is an entry in the backing cache for the relevant key and then:
    • if there is a value in the backing cache for the relevant key, return the value.
    • if there is no value for the relevant key, then call TimeAndSpace using the incoming cultist key and add that result to the backing cache and then return that result.
Implementing this fix requires no changes to either the Cultist or TimeAndSpace classes. Note that FhtagnLoader is no longer needed since we will be loading things into the backing cache directly (more on this below). However we do need to modify coherence-fhtagn.xml as we have a new cache:
<caching-scheme-mapping>
    <cache-mapping>
      <cache-name>forwardingCache</cache-name>
      <scheme-name>forwardingCache</scheme-name>
    </cache-mapping>
    <cache-mapping>
      <cache-name>fhtagnCache</cache-name>
      <scheme-name>fhtagnCache</scheme-name>
    </cache-mapping>
  </caching-scheme-mapping>
  <caching-schemes>
    <distributed-scheme>
      <scheme-name>forwardingCache</scheme-name>
      <service-name>forwardingCache</service-name>
      <autostart>true</autostart>
      <backing-map-scheme>
        <read-write-backing-map-scheme>
          <internal-cache-scheme>
            <local-scheme>
              <class-name>ForwardingCache</class-name>
            </local-scheme>
          </internal-cache-scheme>
          <read-only>true</read-only>
        </read-write-backing-map-scheme>
        <autostart>true</autostart>
      </backing-map-scheme>
    </distributed-scheme>
   <distributed-scheme>
      <scheme-name>fhtagnCache</scheme-name>
      <service-name>fhtagnCache</service-name>
      <autostart>true</autostart>
      <backing-map-scheme>
        <read-write-backing-map-scheme>
          <internal-cache-scheme>
            <local-scheme>
            </local-scheme>
          </internal-cache-scheme>
          <read-only>false</read-only>
        </read-write-backing-map-scheme>
        <autostart>true</autostart>
      </backing-map-scheme>
    </distributed-scheme>
  </caching-schemes>
coherence-fhtagn.xml changes
  • The main change is the addition of a new cache region referencing the new class, ForwardingCache that we will be adding.
  • We've removed the loader for the backing cache since entries will be managed by the ForwardingCache. This necessitates making it a read/write cache as well.

There are a few changes to KeyRitual:
  public static final String FORWARDING_CACHE="forwardingCache";
  public static final String CACHE="fhtagnCache";

  public static void main(String... args) {
    NamedCache forwardingCache = CacheFactory.getCache(FORWARDING_CACHE);
    NamedCache fhtagnCache = CacheFactory.getCache(CACHE);

    // list initial sizes
    ListCacheSizes(forwardingCache, fhtagnCache);

    for (String ancientOne: ANCIENT_ONES) {
      // will make three calls
      for (int i = 0; i < 3; i++) {
        Cultist cultist = new Cultist();
        cultist.setAncientOne(ancientOne);
        cultist.setName("Cultist " + System.currentTimeMillis());
        // Track result of each call
        System.out.println("KeyRitual: Key [" + cultist + "] :: Value [" +  (String) forwardingCache.get(cultist) + "]");
      }
    }

    // list ending sizes
    ListCacheSizes(forwardingCache, fhtagnCache);
  }
KeyRitual changes
  • We track the size of each cache.
  • All get calls are to the forwarding cache.

Last, there is one new class, ForwardingCache that implements the logic described above.
ForwardingCache
import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;
import com.tangosol.net.cache.CacheLoader;
import com.tangosol.net.cache.LocalCache;
import com.tangosol.util.Binary;
import com.tangosol.util.ExternalizableHelper;

public class ForwardingCache extends LocalCache {

  private static final long serialVersionUID = -2892660239113811271L;
  /**
   * These four constructors are required by Coherence
   */
  public ForwardingCache() {
    super();
  }
  public ForwardingCache(int cUnits) {
    super(cUnits);
  }
  public ForwardingCache(int cUnits, int cExpiryMillis) {
    super(cUnits, cExpiryMillis);
  }
  public ForwardingCache(int cUnits, int cExpiryMillis, CacheLoader loader) {
    super(cUnits, cExpiryMillis, loader);
  }

  /** Custom logic for dealing with keys **/
  @Override
  public Object get(Object key) {
    // if not Binary, then something has gone wrong!
    if(key instanceof Binary) {
      // Extract the original key: quick and dirty without check for
      // possible exceptions
      Cultist fullCultist = (Cultist) ExternalizableHelper.fromBinary((Binary) key);
      if (fullCultist == null) {
        return null;
      }

      // create relevant key without the additional meta info.
      Cultist limitedCultist = new Cultist();
      limitedCultist.setAncientOne(fullCultist.getAncientOne());

      // now utilize the backing cache
      NamedCache cache = CacheFactory.getCache(KeyRitual.CACHE);
      String result;
      if (!cache.containsKey(limitedCultist)) {
        result = TimeAndSpace.processSummoning(fullCultist);
        cache.put(limitedCultist, result);
      }
      else {
        result = (String) cache.get(limitedCultist);
      }
      // convert result back to Binary so it will work with distributed cache
      return ExternalizableHelper.toBinary(result);
    }
    else {
      throw new RuntimeException("Expecting binary object, but received " + key.getClass().getCanonicalName());
    }
  }
}
Notes
  • The four constructors are mandated by Coherence. These do nothing but delegate to the parent class.
  • Since we are in a distributed cache, incoming keys and return values must be Binary objects. Conversion to/from is easy to do using Coherence's ExternalizableHelper class
  • We obtain the reference to the backing cache via the Coherence CacheFactory class in order to not worry about managing key affinity: just let Coherence figure out which partition of the backing cache should deal with this key when we perform a put/get with it.

Compiling and running the code is the same as before:
codefhtagn: javac -cp coherence.jar *.java
codefhtagn: java -cp coherence.jar:. -Dtangosol.coherence.cacheconfig=coherence-fhtagn.xml KeyRitual
... lines omitted ...
2010-09-20 15:08:14.650/4.873 Oracle Coherence GE 3.4.2/411 <d5> (thread=DistributedCache:fhtagnCache, member=1): Service fhtagnCache joined the cluster with senior service member 1
Cache forwardingCache has size: 0
Cache fhtagnCache has size: 0
TimeAndSpace: Processing summoning by Cultist 1285016894699 for Cthulhu
KeyRitual: Key [Caller: Cultist 1285016894699 -- Summoning: Cthulhu] :: Value [Sleeping]
KeyRitual: Key [Caller: Cultist 1285016894713 -- Summoning: Cthulhu] :: Value [Sleeping]
KeyRitual: Key [Caller: Cultist 1285016894715 -- Summoning: Cthulhu] :: Value [Sleeping]
TimeAndSpace: Processing summoning by Cultist 1285016894717 for Yog-Sothoth
KeyRitual: Key [Caller: Cultist 1285016894717 -- Summoning: Yog-Sothoth] :: Value [On vacation]
KeyRitual: Key [Caller: Cultist 1285016894719 -- Summoning: Yog-Sothoth] :: Value [On vacation]
KeyRitual: Key [Caller: Cultist 1285016894720 -- Summoning: Yog-Sothoth] :: Value [On vacation]
Cache forwardingCache has size: 0
Cache fhtagnCache has size: 2
2010-09-20 15:08:14.724/4.947 Oracle Coherence GE 3.4.2/411 <d4> (thread=ShutdownHook, member=1): ShutdownHook: stopping cluster node

Now we have achieved what we wanted: there are only two entries in the cache, one for each Ancient One being summoned.

As a final note, you may be wondering why it is necessary to have a separate backing cache? Since we are not putting anything into the forwarding cache, after we extract the relevant key from the incoming key, can't we just do the put/get for the relevant keys using the forwarding cache itself? The unfortunate answer is no, this is not possible since Coherence does not allow a cache to call into itself (see here for more info)

To demonstrate this, suppose we make the following change to ForwardingCache:
      // switch to use same cache
      NamedCache cache = CacheFactory.getCache(KeyRitual.FORWARDING_CACHE);
      String result;
      if (!cache.containsKey(limitedCultist)) {


Running this gives the following result:
codefhtagn: javac -cp coherence.jar *.java
codefhtagn: java -cp coherence.jar:. -Dtangosol.coherence.cacheconfig=coherence-fhtagn.xml KeyRitual
... lines omitted ...
Cache forwardingCache has size: 0
Cache fhtagnCache has size: 0
2010-09-20 15:11:14.965/4.913 Oracle Coherence GE 3.4.2/411 <error> (thread=DistributedCache:forwardingCache, member=1): Assertion failed: poll() is a blocking call and cannot be called on the Service thread
 at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.poll(Grid.CDB:4)
 at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.poll(Grid.CDB:11)
 at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$BinaryMap.containsKey(DistributedCache.CDB:24)
 at com.tangosol.util.ConverterCollections$ConverterMap.containsKey(ConverterCollections.java:1494)
 at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.DistributedCache$ViewMap.containsKey(DistributedCache.CDB:1)
 at com.tangosol.coherence.component.util.SafeNamedCache.containsKey(SafeNamedCache.CDB:1)
 at ForwardingCache.get(ForwardingCache.java:49)
... lines omitted ...


So, to summarize:
  • If you are using a distributed cache with Coherence, be aware that all non-transient fields will be used to determine key uniqueness.
  • If you are forced to add additional fields to a key which are not part of the uniqueness when mapping a key to a value in a cache entry, then consider using a forwarding cache to strip out those unnecessary attributes before adding/retrieving values with a separate backing cache.

No comments:

Post a Comment