Alexa Skills Kit (ASK) SDK for Node.js V2: what's new, what's changed, what's gone
This was a surprise: today the Alexa team released V2 of the Node.js Alexa Skills Kit SDK. Unlike the new Actions on Google SDK V2, the new ASK SDK was not created to handle new request or response formats. Instead, the interface is the major focus of the new version, placing an emphasis on modularity (as pointed out by @nickschwab, @marktucker, and others) and extensibility. If you're used to the existing SDK, you'll have to change your mental model of how to build skills. (Most code samples taken from the SDK wiki. That will have some details that aren't here, and vice-versa.)
Note: thanks to the folks at Amazon, via Paul Cutsinger, for alerting me to a few things I missed or got wrong below. Corrections italicized or crossed out.
Changed in V2
- Modularity
- Handler definitions
- Response builder
- Lambda handler
- Attributes
- State handling
- Data persistence
- Template helpers
- App validation
New in V2
Removed in V2
- Localization
- Unhandled and NewSession handlers
- emit and emitWithState
- APP ID validation
Moving to ASK SDK V2 from V1
First: you do not need to move to V2 if V1 is working for you, or if you want to build a new skill and don’t want to use the newest SDK. As of right now, the ASK CLI doesn’t even use SDK V2. (And, quickly, the ASK CLI was updated before I even published this post.) If you want to use the SDK that will have the newest functionality (e.g. notifications, gadgets), go ahead and switch to V2.
The easiest way to switch is by using the `ask-sdk-v1adapter`. The adapter works the same as before, and adds a `registerV2Handlers` method. Note that V1 handlers are used before any V2 handlers; if you have two that conflict, V1 will always take precedence.
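A minimal sketch of that setup, assuming `v1Handlers` is your existing V1 handler object and `HelloWorldIntentHandler` is a new V2-style handler:

```js
// The adapter is a drop-in replacement for the old 'alexa-sdk' import.
const Alexa = require('ask-sdk-v1adapter');

exports.handler = function (event, context, callback) {
  const alexa = Alexa.handler(event, context, callback);
  alexa.registerHandlers(v1Handlers);                // existing V1-style handlers
  alexa.registerV2Handlers(HelloWorldIntentHandler); // new V2-style handlers
  alexa.execute();
};
```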
Changed in ASK SDK V2
Modularity
The biggest conceptual change is the new focus on modularity. I say this is the biggest change because, even next to others like the new way of handling requests, a number of the changes come as a result of the modularity. Data persistence via the `PersistenceAdapter`, service clients, and the `ApiClient` all come about due to the focus on modularity. The best illustration of this is the skill builders: two different ways to assemble a skill's handlers (whether on Lambda or otherwise), the custom and the standard skill builders. With the standard skill builder, you're using the `ApiClient` that comes from the Alexa team and the DynamoDB persistence adapter. The custom skill builder allows for injection of an API client and persistence adapter of your choosing.
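As a rough sketch (the handler names, table name, and custom adapter are placeholders), the two builders look something like this:

```js
const Alexa = require('ask-sdk'); // the 'standard' distribution; it re-exports the core pieces

// Standard skill builder: Amazon's default ApiClient plus the DynamoDB persistence adapter.
const standardHandler = Alexa.SkillBuilders.standard()
  .addRequestHandlers(LaunchRequestHandler, HelloWorldIntentHandler) // assumed defined elsewhere
  .withTableName('MySkillTable')  // hypothetical table name
  .withAutoCreateTable(true)
  .lambda();

// Custom skill builder: inject the API client and persistence adapter of your choosing.
const customHandler = Alexa.SkillBuilders.custom()
  .addRequestHandlers(LaunchRequestHandler, HelloWorldIntentHandler)
  .withApiClient(new Alexa.DefaultApiClient())
  .withPersistenceAdapter(myPersistenceAdapter)     // hypothetical custom adapter
  .lambda();

exports.handler = standardHandler; // or customHandler
```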
Handler definitions
Whoa, this is a big one, and, after the initial "I don't like new things" reaction, I think I like the new way better. Reminder: in V1, handlers were tied to a state, and used the event emitter pattern by looking for all of the registered event listeners for an intent + state pairing. In V2, handlers are defined sequentially (that is to say, with a hierarchy) and each handler has a function that determines whether it is valid for the current request.
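A representative handler, sketched from the standard V2 pattern (the speech text is a placeholder):

```js
const HelloWorldIntentHandler = {
  canHandle(handlerInput) {
    // Only claim the request if it's an IntentRequest for HelloWorldIntent.
    const { request } = handlerInput.requestEnvelope;
    return request.type === 'IntentRequest'
      && request.intent.name === 'HelloWorldIntent';
  },
  handle(handlerInput) {
    return handlerInput.responseBuilder
      .speak('Hello, world!')
      .getResponse();
  },
};
```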
In this snippet, `canHandle` will run first, and only if that returns true (in this case, when the current request is for the `HelloWorldIntent`) will `handle` run. This opens handlers up to handle more within a single function. For example, `canHandle` can check that the current state is `IN_GAME` and the request is either `AMAZON.MoreIntent` or `AMAZON.ScrollDownIntent`. In V1, you would need to declare both of these separately, perhaps calling the same function or emitting the "canonical" handler. Another example is handling `AMAZON.StopIntent` the same way for all states. Because intent handling isn't as closely tied to states as it was in V1, you can have a cascade where an intent is handled by a fallback handler unless it was handled higher in the hierarchy. The hierarchy is defined by the order in which the handlers are passed to `addRequestHandlers`.
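A sketch of that registration order (the handler objects are assumed to be defined as above):

```js
const Alexa = require('ask-sdk-core');

// Order matters: LaunchRequestHandler's canHandle is checked first,
// then HelpIntentHandler's, then AnswerIntentHandler's.
exports.handler = Alexa.SkillBuilders.custom()
  .addRequestHandlers(
    LaunchRequestHandler,
    HelpIntentHandler,
    AnswerIntentHandler
  )
  .lambda();
```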
In this case, if a request matches both `HelpIntent` and `AnswerIntent`, the `HelpIntent` handler will fulfill the request, because it comes first in the argument list. Again, do not downplay what this opens up. The code can route handlers based off intent, slot values, attributes, or even external API calls. Although… ignore that last one. These checks are sequential: the `LaunchRequest` handler's `canHandle` function is checked, then `HelpIntent`'s, then `AnswerIntent`'s, and on down the line. That processing time can add up.
The `handlerInput` argument is an object containing:
- `RequestEnvelope`, the inbound request
- `AttributesManager`, which handles the attributes (stored data)
- `ServiceClientFactory`, which can connect to Alexa services (like requesting the device address)
- `ResponseBuilder`, for constructing the response
- `Context`, the `context` argument to an AWS Lambda function
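A quick sketch of pulling those members apart inside a handler (all of the logged values are just for illustration):

```js
const InspectIntentHandler = {
  canHandle(handlerInput) {
    return handlerInput.requestEnvelope.request.type === 'IntentRequest';
  },
  handle(handlerInput) {
    // All five members are available on the single handlerInput argument.
    const { requestEnvelope, attributesManager, serviceClientFactory, responseBuilder, context } = handlerInput;

    console.log('Intent:', requestEnvelope.request.intent.name);
    console.log('Session attributes:', attributesManager.getSessionAttributes());
    console.log('Service clients configured:', Boolean(serviceClientFactory));
    console.log('Lambda context present:', Boolean(context));

    return responseBuilder.speak('Check the logs.').getResponse();
  },
};
```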
Response builder
Response builder has changed, with a new interface and no need to emit `:responseReady`, but no other significant changes at first glance.
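For reference, a small sketch of the new interface (the speech and card text are made up):

```js
const LaunchRequestHandler = {
  canHandle(handlerInput) {
    return handlerInput.requestEnvelope.request.type === 'LaunchRequest';
  },
  handle(handlerInput) {
    // Build and return the response directly; no :responseReady emit required.
    return handlerInput.responseBuilder
      .speak('Welcome back!')
      .reprompt('What would you like to do?')
      .withSimpleCard('My Skill', 'Welcome back!')
      .getResponse();
  },
};
```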
Lambda handler
The exported lambda handler function is very different compared to V1, and abstracts away what is happening when the function is called.
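Sketched from the standard V2 pattern, the export looks roughly like this (handlers assumed defined elsewhere):

```js
const Alexa = require('ask-sdk-core');

// The builder produces the Lambda entry point for you; the event, context, and
// callback are handled inside the SDK rather than in your own code.
exports.handler = Alexa.SkillBuilders.custom()
  .addRequestHandlers(LaunchRequestHandler, HelpIntentHandler)
  .addErrorHandlers(ErrorHandler)
  .lambda();
```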
This new code makes it less clear that the incoming event and context are sent to the SDK and used to build the response. If you're familiar with how Alexa works, it's obvious, but not if you're coming to it for the first time. Then again, most people coming to it for the first time might overlook that part of the setup anyway. Interestingly, the SDK no longer uses the `context.fail` and `context.succeed` methods, and instead uses the callback (the third argument to the function, which is not present in most skill code and examples you'll find). This is likely related to an earlier issue with the SDK where attributes failed to write to DynamoDB, fixable by relying on the callback rather than the context methods.
There is another way of creating the Lambda handler that I glossed over. This approach both makes it clearer that the handler is receiving arguments and makes it easier to log the inbound event.
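A sketch of that approach, building the skill with `create()` and invoking it yourself (same assumed handlers as above):

```js
const Alexa = require('ask-sdk-core');

const skill = Alexa.SkillBuilders.custom()
  .addRequestHandlers(LaunchRequestHandler, HelpIntentHandler)
  .create();

exports.handler = async function (event, context) {
  // The raw Lambda event is right here, so it's easy to log before invoking the skill.
  console.log('Inbound request:', JSON.stringify(event));
  const response = await skill.invoke(event, context);
  console.log('Outbound response:', JSON.stringify(response));
  return response;
};
```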
Attributes
Attributes now have three scopes: request, session, and persistent. Request attributes can be used to store temporary data or helper functions. Session attributes go from the skill, to the Alexa service, and back in the next request, so long as the session is alive. Persistent attributes are sent to a data store (most commonly this will be DynamoDB) via the `PersistenceAdapter`. All attributes are handled through the `AttributesManager` and its methods for getting all three types of attributes, setting all three, and saving persistent attributes.
One huge implication is that V2 doesn't have the concept of "state." This arises from how both attributes and intents are handled. Intents no longer flow from a top-level state, they "self-select," and attributes are now more granular. You can invoke a handler based on the value of an attribute that you call `STATE`, or you can choose any other attribute or criteria.
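Here's a sketch of the session and persistent scopes in one handler, with a `STATE` session attribute standing in for the old V1 state (the names are illustrative):

```js
const AnswerIntentHandler = {
  canHandle(handlerInput) {
    const { request } = handlerInput.requestEnvelope;
    const sessionAttributes = handlerInput.attributesManager.getSessionAttributes();
    // Route on a session attribute instead of a V1-style state.
    return request.type === 'IntentRequest'
      && request.intent.name === 'AnswerIntent'
      && sessionAttributes.STATE === 'IN_GAME';
  },
  async handle(handlerInput) {
    const { attributesManager, responseBuilder } = handlerInput;

    // Session scope: lives only as long as the current session.
    const sessionAttributes = attributesManager.getSessionAttributes();
    sessionAttributes.lastAnswer = 'blue';
    attributesManager.setSessionAttributes(sessionAttributes);

    // Persistent scope: written out through the configured PersistenceAdapter.
    const persistentAttributes = await attributesManager.getPersistentAttributes();
    persistentAttributes.gamesPlayed = (persistentAttributes.gamesPlayed || 0) + 1;
    attributesManager.setPersistentAttributes(persistentAttributes);
    await attributesManager.savePersistentAttributes();

    return responseBuilder.speak('Answer recorded.').getResponse();
  },
};
```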
Data persistence
Another example where modularity shows its impact: connecting a Lambda skill to DynamoDB is no longer as simple as setting the table name. (I overspoke here: with the standard handler, you can create the table by providing the name and specifying that the table should be automatically created. Not as simple as before, but not far off, either.) Consequently, skills can now use other data stores. This, however, is only available when using `CustomSkillBuilder`.
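As a sketch of what a custom adapter can look like (an in-memory store, purely illustrative), the adapter only needs `getAttributes` and `saveAttributes` methods and is handed to the `CustomSkillBuilder`:

```js
const Alexa = require('ask-sdk-core');

// Illustrative in-memory persistence adapter; a real one would talk to S3, Redis, etc.
const memoryStore = {};
const inMemoryPersistenceAdapter = {
  async getAttributes(requestEnvelope) {
    const userId = requestEnvelope.context.System.user.userId;
    return memoryStore[userId] || {};
  },
  async saveAttributes(requestEnvelope, attributes) {
    const userId = requestEnvelope.context.System.user.userId;
    memoryStore[userId] = attributes;
  },
};

exports.handler = Alexa.SkillBuilders.custom()
  .addRequestHandlers(LaunchRequestHandler) // assumed defined elsewhere
  .withPersistenceAdapter(inMemoryPersistenceAdapter)
  .lambda();
```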
The above is an example of using a custom persistence adapter. Maybe you want, instead, to use DynamoDB. Compare how you would do it in V1 with how you would in V2.
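Roughly, the comparison being drawn (the table name is hypothetical):

```js
// V1: set the table name on the handler object.
const AlexaV1 = require('alexa-sdk');

exports.handler = function (event, context, callback) {
  const alexa = AlexaV1.handler(event, context, callback);
  alexa.dynamoDBTableName = 'MySkillTable';
  alexa.registerHandlers(handlers); // V1 handlers assumed defined elsewhere
  alexa.execute();
};
```

```js
// V2: the standard builder wires up DynamoDB from the table name.
const AlexaV2 = require('ask-sdk');

exports.handler = AlexaV2.SkillBuilders.standard()
  .addRequestHandlers(LaunchRequestHandler) // V2 handlers assumed defined elsewhere
  .withTableName('MySkillTable')
  .withAutoCreateTable(true)
  .lambda();
```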
Template helpers
Gone are the template builders, replaced with `addRenderTemplateDirective`, which accepts objects of options.
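A hedged sketch of the directive-based approach, using a `BodyTemplate1` with made-up content:

```js
const DisplayIntentHandler = {
  canHandle(handlerInput) {
    return handlerInput.requestEnvelope.request.type === 'IntentRequest';
  },
  handle(handlerInput) {
    // Instead of a template builder, pass the template object straight to the directive.
    return handlerInput.responseBuilder
      .speak('Here is your screen.')
      .addRenderTemplateDirective({
        type: 'BodyTemplate1',
        token: 'welcome',
        backButton: 'HIDDEN',
        title: 'Welcome',
        textContent: {
          primaryText: {
            type: 'RichText',
            text: '<b>Hello from the display template.</b>',
          },
        },
      })
      .getResponse();
  },
};
```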
New in ASK SDK V2
Error handlers
New in V2 is the idea of error handlers with the same stature as request handlers. They accept the incoming handler details and the current error, are registered with `addErrorHandlers`, and "self-select" via the same `canHandle` function (again, returning a boolean).
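A catch-all error handler might look something like this (the speech is a placeholder):

```js
const GenericErrorHandler = {
  canHandle(handlerInput, error) {
    // Return true to claim every error; you could also inspect error.name here.
    return true;
  },
  handle(handlerInput, error) {
    console.error('Error handled:', error.message);
    return handlerInput.responseBuilder
      .speak('Sorry, something went wrong. Please try again.')
      .getResponse();
  },
};

// Registered alongside the request handlers:
// Alexa.SkillBuilders.custom().addRequestHandlers(...).addErrorHandlers(GenericErrorHandler).lambda();
```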
Interceptors
These are functions that are called at certain points in the lifecycle; specifically, for the request or the response. The request interceptor comes into play before the selected request handler runs. The response interceptor runs right after that handler runs. Add them to the skill builder via `addRequestInterceptors` and `addResponseInterceptors`. Request interceptors are useful for adding data to the request attributes, while response interceptors can be used to validate data. Each returns a promise that resolves to nothing.
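A sketch of each (the logging and the attribute name are illustrative):

```js
// Runs before the selected request handler; a handy place to put shared data
// into the request attributes.
const RequestLoggingInterceptor = {
  process(handlerInput) {
    console.log('Incoming request:', JSON.stringify(handlerInput.requestEnvelope));
    const requestAttributes = handlerInput.attributesManager.getRequestAttributes();
    requestAttributes.startTime = Date.now(); // hypothetical shared value
    handlerInput.attributesManager.setRequestAttributes(requestAttributes);
  },
};

// Runs right after the handler, and receives the response it produced.
const ResponseLoggingInterceptor = {
  process(handlerInput, response) {
    console.log('Outgoing response:', JSON.stringify(response));
  },
};

// Added via:
// Alexa.SkillBuilders.custom()
//   .addRequestInterceptors(RequestLoggingInterceptor)
//   .addResponseInterceptors(ResponseLoggingInterceptor)
```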
ApiClient
SDK V2 now has a built-in client for communicating with external APIs (when using the standard skill builder), or can accept a custom one if so desired. I won’t drill too far into this, but check out the implementation for yourself.
Also new, similar to the `ApiClient`, are the Alexa Service Clients. These must be used in conjunction with either the default API client or a custom one, and are used to get lists, the device address, or send directives (i.e. queueing audio).
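For example, a device-address lookup might be sketched like this (permission handling is omitted):

```js
const AddressIntentHandler = {
  canHandle(handlerInput) {
    const { request } = handlerInput.requestEnvelope;
    return request.type === 'IntentRequest' && request.intent.name === 'AddressIntent';
  },
  async handle(handlerInput) {
    const { requestEnvelope, serviceClientFactory, responseBuilder } = handlerInput;
    const deviceId = requestEnvelope.context.System.device.deviceId;

    // Only available when an ApiClient (default or custom) has been configured.
    const addressClient = serviceClientFactory.getDeviceAddressServiceClient();
    const address = await addressClient.getCountryAndPostalCode(deviceId);

    return responseBuilder
      .speak(`Your postal code is ${address.postalCode}.`)
      .getResponse();
  },
};
```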
Removed in ASK SDK V2
Localization
No more baked-in `i18next`. Good news if you're not using it and you want to reduce the size of the skill you're uploading to Lambda. One way to add this back into the skill is through a request interceptor. Check out this example from the Alexa team, where `.t` hitches a ride with the request attributes, and is thus available in all handlers.
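A simplified version of that idea (the string table and key are made up), attaching a translate function to the request attributes in a request interceptor:

```js
const i18next = require('i18next');

// Illustrative string table.
const languageStrings = {
  en: { translation: { WELCOME: 'Welcome to the skill!' } },
  de: { translation: { WELCOME: 'Willkommen beim Skill!' } },
};

const LocalizationInterceptor = {
  process(handlerInput) {
    i18next.init({
      lng: handlerInput.requestEnvelope.request.locale,
      fallbackLng: 'en',
      resources: languageStrings,
    });

    // Hang the translate function off the request attributes so every handler can use it.
    const attributes = handlerInput.attributesManager.getRequestAttributes();
    attributes.t = (...args) => i18next.t(...args);
    handlerInput.attributesManager.setRequestAttributes(attributes);
  },
};

// Inside a handler: handlerInput.attributesManager.getRequestAttributes().t('WELCOME')
```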
Unhandled and NewSession handlers
The `Unhandled` and `NewSession` handlers are no more. Instead, rely on the handler hierarchy and the `canHandle` function. For example, take the `Unhandled` handler. In V1, that was invoked when there was no handler for the request-plus-state combination. V2 places less of an emphasis on state, and more of an emphasis on the "self-selection" of handlers. You can still have multiple handlers catching unhandled intents for a given state (though, again, "state" doesn't really exist like it did in V1), but you can also have one single catch-all handler. Make sure that it's last in the argument list and that its `canHandle` always returns true. That way, anything not otherwise handled will fall on through.
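A catch-all might be sketched as:

```js
const CatchAllHandler = {
  canHandle() {
    // Always true: anything not claimed by an earlier handler lands here.
    return true;
  },
  handle(handlerInput) {
    return handlerInput.responseBuilder
      .speak("Sorry, I didn't get that. What would you like to do?")
      .reprompt('What would you like to do?')
      .getResponse();
  },
};

// Register it last so every other handler gets a chance first:
// .addRequestHandlers(LaunchRequestHandler, HelpIntentHandler, CatchAllHandler)
```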
emit and emitWithState
Both gone. Now simply call the function you want to invoke.
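For instance, where V1 code might `this.emit('AMAZON.HelpIntent')`, a V2 handler can just delegate to another handler's function (assuming a `HelpIntentHandler` like the ones above):

```js
const FallbackToHelpHandler = {
  canHandle(handlerInput) {
    const { request } = handlerInput.requestEnvelope;
    return request.type === 'IntentRequest'
      && request.intent.name === 'AMAZON.FallbackIntent';
  },
  handle(handlerInput) {
    // No emit: call the other handler's handle function directly.
    return HelpIntentHandler.handle(handlerInput);
  },
};
```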
APP ID Validation
Gone as well, this is now handled directly inside Lambda if you use it, or can be done in a request interceptor.
I got this one wrong. There is still ID validation, using the `.withSkillId()` method on the skill builders.
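For example (the skill ID is a placeholder):

```js
const Alexa = require('ask-sdk-core');

exports.handler = Alexa.SkillBuilders.custom()
  .addRequestHandlers(LaunchRequestHandler) // assumed defined elsewhere
  .withSkillId('amzn1.ask.skill.12345678-aaaa-bbbb-cccc-1234567890ab') // placeholder ID
  .lambda();
```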
Overall, there are a large number of cosmetic (interface) changes in the Alexa Skills Kit SDK for Node.js V2, but there are a lot of functional changes as well. Perhaps the most important are the modularity changes, the reduced size of the SDK, how handlers are selected, and the request and response interceptors. Give it a try, or dip your toe in with the V1 adapter.