Character Encoding in API Responses

August stores all user-facing strings in HTML-escaped form, and API responses return them that way. The goal is to prevent malicious user input from affecting client applications, so clients must unescape these strings before displaying them.

Here is an extreme example of encoding a LockName like ©®`~|•^'#&;

curl -X GET -H "x-august-api-key: $APIKEY" -H "x-august-access-token: $AUGTOKEN" -H "content-type: application/json" https://api-production.august.com/users/locks/mine

{"ABC....123LOCK ID": {
        "HouseID": "def...-...-123HOUSE ID",
        "HouseName": "Andy's Desk Prod 2",
        "LockName": "©®\\`~|•^'#&;",
        "UserType": "superuser",
        "macAddress": "mm.aa.cc.aa.dd.rr"
    }
}

FAQ

So how should I unescape these strings?

Android

On Android, August uses unbescape for encoding and decoding.

http://www.unbescape.org/

unbescape is a Java library aimed at performing fully-featured and high-performance escape and unescape operations for:

  • HTML (HTML5 and HTML 4)
  • XML (XML 1.0 and XML 1.1)
  • JavaScript
  • JSON
  • URI / URL (both paths and query parameters)
  • CSS (both identifiers and string literals)
  • CSV (Comma-Separated Values)
  • Java literals
  • Java .properties files (both keys and values)

unbescape is Open Source Software, and it is distributed under the terms of the Apache License 2.0.
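
For projects that cannot add the unbescape dependency, the decoding step can be approximated with the Java standard library alone. The sketch below is only an illustration: EntityDecoder and its decode method are hypothetical names, not part of unbescape or of any August SDK, and the sample string is made up. It handles just the five predefined XML entities plus decimal and hexadecimal numeric references; unbescape covers the full HTML entity set and is the better choice where it is available.

```java
import java.util.Map;

final class EntityDecoder {
    // Named entities handled by this sketch (hypothetical helper, not an August API).
    private static final Map<String, String> NAMED = Map.of(
            "amp", "&", "apos", "'", "quot", "\"", "lt", "<", "gt", ">");

    // Replaces recognized entities in text; everything else is copied through unchanged.
    static String decode(String text) {
        if (text == null || text.indexOf('&') < 0) {
            return text; // nothing to decode
        }
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            // An entity candidate runs from '&' to the next ';'.
            int semi = (c == '&') ? text.indexOf(';', i + 1) : -1;
            String decoded = (semi > i + 1) ? decodeEntity(text.substring(i + 1, semi)) : null;
            if (decoded != null) {
                out.append(decoded);
                i = semi + 1;
            } else {
                out.append(c); // ordinary character or isolated '&'
                i++;
            }
        }
        return out.toString();
    }

    // Maps the text between '&' and ';' to its replacement, or null if unrecognized.
    private static String decodeEntity(String body) {
        String named = NAMED.get(body);
        if (named != null) {
            return named;
        }
        if (body.length() < 2 || body.charAt(0) != '#') {
            return null;
        }
        try {
            boolean hex = body.charAt(1) == 'x' || body.charAt(1) == 'X';
            int codePoint = hex
                    ? Integer.parseInt(body.substring(2), 16)
                    : Integer.parseInt(body.substring(1), 10);
            if (!Character.isValidCodePoint(codePoint)) {
                return null;
            }
            // toChars emits a surrogate pair for code points above U+FFFF.
            return new String(Character.toChars(codePoint));
        } catch (NumberFormatException e) {
            return null; // e.g. "&#x;" or "&#abc;"
        }
    }

    public static void main(String[] args) {
        // Made-up sample value, not taken from a real API response.
        System.out.println(decode("Andy&#39;s Desk &amp; Lab"));
    }
}
```

Unrecognized or unterminated sequences such as AT&T or &bogus; are passed through unchanged, which mirrors how the iOS routine below treats an isolated & symbol.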

iOS

On iOS, we use the following method to decode HTML entities (based on http://stackoverflow.com/a/1453142 ):

+ (NSString *)stringByDecodingXMLEntities:(NSString *)stringToDecode {
    if ((stringToDecode == nil) || (![stringToDecode.class isSubclassOfClass:[NSString class]])) {
        return nil;
    }
    NSUInteger myLength = [stringToDecode length];
    NSUInteger ampIndex = [stringToDecode rangeOfString:@"&" options:NSLiteralSearch].location;

    // Short-circuit if there are no ampersands.
    if (ampIndex == NSNotFound) {
        return stringToDecode;
    }
    // Make result string with some extra capacity.
    NSMutableString *result = [NSMutableString stringWithCapacity:(myLength * 1.25)];

    // First iteration doesn't need to scan to & since we did that already, but for code simplicity's sake we'll do it again with the scanner.
    NSScanner *scanner = [NSScanner scannerWithString:stringToDecode];

    [scanner setCharactersToBeSkipped:nil];

    NSCharacterSet *boundaryCharacterSet = [NSCharacterSet characterSetWithCharactersInString:@" \t\n\r;"];

    do {
        // Scan up to the next entity or the end of the string.
        NSString *nonEntityString;
        if ([scanner scanUpToString:@"&" intoString:&nonEntityString]) {
            [result appendString:nonEntityString];
        }
        if ([scanner isAtEnd]) {
            goto finish;
        }
        // Scan either a HTML or numeric character entity reference.
        if ([scanner scanString:@"&amp;" intoString:NULL])
            [result appendString:@"&"];
        else if ([scanner scanString:@"&apos;" intoString:NULL])
            [result appendString:@"'"];
        else if ([scanner scanString:@"&quot;" intoString:NULL])
            [result appendString:@"\""];
        else if ([scanner scanString:@"&lt;" intoString:NULL])
            [result appendString:@"<"];
        else if ([scanner scanString:@"&gt;" intoString:NULL])
            [result appendString:@">"];
        else if ([scanner scanString:@"&#" intoString:NULL]) {
            BOOL gotNumber;
            unsigned charCode;
            NSString *xForHex = @"";

            // Is it hex or decimal?
            if ([scanner scanString:@"x" intoString:&xForHex]) {
                gotNumber = [scanner scanHexInt:&charCode];
            }
            else {
                gotNumber = [scanner scanInt:(int*)&charCode];
            }

            if (gotNumber) {
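                if (charCode > 0xFFFF) {
                    // Code points above U+FFFF do not fit in one unichar; emit a UTF-16 surrogate pair.
                    unsigned offset = charCode - 0x10000;
                    [result appendFormat:@"%C%C", (unichar)(0xD800 + (offset >> 10)), (unichar)(0xDC00 + (offset & 0x3FF))];
                }
                else {
                    [result appendFormat:@"%C", (unichar)charCode];
                }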

                [scanner scanString:@";" intoString:NULL];
            }
            else {
                NSString *unknownEntity = @"";

                [scanner scanUpToCharactersFromSet:boundaryCharacterSet intoString:&unknownEntity];


                [result appendFormat:@"&#%@%@", xForHex, unknownEntity];

                //[scanner scanUpToString:@";" intoString:&unknownEntity];
                //[result appendFormat:@"&#%@%@;", xForHex, unknownEntity];
                NSLog(@"Expected numeric character entity but got &#%@%@;", xForHex, unknownEntity);

            }

        }
        else {
            NSString *amp;

            [scanner scanString:@"&" intoString:&amp];  //an isolated & symbol
            [result appendString:amp];

            /*
             NSString *unknownEntity = @"";
             [scanner scanUpToString:@";" intoString:&unknownEntity];
             NSString *semicolon = @"";
             [scanner scanString:@";" intoString:&semicolon];
             [result appendFormat:@"%@%@", unknownEntity, semicolon];
             NSLog(@"Unsupported XML character entity %@%@", unknownEntity, semicolon);
             */
        }

    }
    while (![scanner isAtEnd]);

finish:
    return result;
}