LatentSemanticMapping.h

Includes:
<CoreFoundation/CoreFoundation.h>
<CoreServices/CoreServices.h>
<Carbon/Carbon.h>
<stdio.h>
<stdint.h>

Overview

The Latent Semantic Mapping (LSM) framework classifies texts based on latent semantic information.



Functions

LSMMapAddCategory
LSMMapAddText
LSMMapAddTextWithWeight
LSMMapApplyClusters
LSMMapCompile
LSMMapCreate
LSMMapCreateClusters
LSMMapCreateFromURL
LSMMapGetCategoryCount
LSMMapGetProperties
LSMMapGetTypeID
LSMMapSetProperties
LSMMapSetStopWords
LSMMapStartTraining
LSMMapWriteToStream
LSMMapWriteToURL
LSMResultCopyToken
LSMResultCopyTokenCluster
LSMResultCopyWord
LSMResultCopyWordCluster
LSMResultCreate
LSMResultGetCategory
LSMResultGetCount
LSMResultGetScore
LSMResultGetTypeID
LSMTextAddToken
LSMTextAddWord
LSMTextAddWords
LSMTextCreate
LSMTextGetTypeID

LSMMapAddCategory


LSMCategory LSMMapAddCategory(
    LSMMapRef mapref);  
Discussion

Adds another category and returns its category identifier.


LSMMapAddText


OSStatus LSMMapAddText(
    LSMMapRef mapref,
    LSMTextRef textref,
    LSMCategory category);  
Discussion

Adds a training text to the given category. The map does not need the textref after this call, so the caller may release it.


LSMMapAddTextWithWeight


OSStatus LSMMapAddTextWithWeight(
    LSMMapRef mapref,
    LSMTextRef textref,
    LSMCategory category,
    float weight);  
Discussion

Adds a training text to the given category with a weight different from 1. The weight may be negative, but global counts are pinned at 0 and never drop below it. The map does not need the textref after this call, so the caller may release it.


LSMMapApplyClusters


OSStatus LSMMapApplyClusters(
    LSMMapRef mapref,
    CFArrayRef clusters);  
Discussion

Groups categories or words (tokens) into the specified sets of clusters.


LSMMapCompile


OSStatus LSMMapCompile(
    LSMMapRef mapref);  
Discussion

Compiles the map into executable form and puts it into mapping mode, preparing it for the classification of texts. This function is computationally expensive.


LSMMapCreate


LSMMapRef LSMMapCreate(
    CFAllocatorRef alloc,
    CFOptionFlags flags);  
Discussion

Creates a new LSM map. Call CFRelease to dispose of it.


LSMMapCreateClusters


CFArrayRef LSMMapCreateClusters(
    CFAllocatorRef alloc, 
    LSMMapRef mapref,
    CFArrayRef subset, 
    CFIndex numClusters,
    CFOptionFlags flags);  
Discussion

Computes a set of clusters grouping similar categories or words. If subset is non-NULL, clustering is performed only on the categories or words listed.


LSMMapCreateFromURL


LSMMapRef LSMMapCreateFromURL(
    CFAllocatorRef alloc,
    CFURLRef file, 
    CFOptionFlags flags);  
Discussion

Loads a map from a given file.


LSMMapGetCategoryCount


CFIndex LSMMapGetCategoryCount(
    LSMMapRef mapref);  
Discussion

Returns the number of categories in the map.


LSMMapGetProperties


CFDictionaryRef LSMMapGetProperties(
    LSMMapRef mapref);  
Discussion

Gets a dictionary of properties for the map. LSM retains ownership of this dictionary; do not release it.


LSMMapGetTypeID


CFTypeID LSMMapGetTypeID(
    void);  
Discussion

Returns the Core Foundation type identifier for LSM maps.


LSMMapSetProperties


void LSMMapSetProperties(
    LSMMapRef mapref,
    CFDictionaryRef properties);  
Discussion

Sets a dictionary of properties for the map. LSM makes its own copy of the properties, so there is no need to retain them past this call.


LSMMapSetStopWords


OSStatus LSMMapSetStopWords(
    LSMMapRef mapref,
    LSMTextRef textref);  
Discussion

The specified words will be omitted from all classification efforts. This function must be called before any other texts are created. The map does not need the textref after this call, so the caller may release it.


LSMMapStartTraining


OSStatus LSMMapStartTraining(
    LSMMapRef mapref);  
Discussion

Puts the map into training mode, preparing it for the addition of more categories and/or texts. This function is somewhat expensive because it requires substantial reorganization of data structures.


LSMMapWriteToStream


OSStatus LSMMapWriteToStream(
    LSMMapRef mapref,
    LSMTextRef textref, 
    CFWriteStreamRef stream,
    CFOptionFlags options);  
Discussion

Writes information about a map and/or text to a stream in text form.


LSMMapWriteToURL


OSStatus LSMMapWriteToURL(
    LSMMapRef mapref,
    CFURLRef file,
    CFOptionFlags flags);  
Discussion

Compiles the map if necessary and then stores it into the given file.


LSMResultCopyToken


CFDataRef LSMResultCopyToken(
    LSMResultRef mapref,
    CFIndex n);  
Discussion

Returns the token for the n-th best (zero based) result.


LSMResultCopyTokenCluster


CFArrayRef LSMResultCopyTokenCluster(
    LSMResultRef mapref,
    CFIndex n);  
Discussion

Returns the cluster of tokens for the n-th best (zero based) result.


LSMResultCopyWord


CFStringRef LSMResultCopyWord(
    LSMResultRef result,
    CFIndex n);  
Discussion

Returns the word for the n-th best (zero based) result.


LSMResultCopyWordCluster


CFArrayRef LSMResultCopyWordCluster(
    LSMResultRef result,
    CFIndex n);  
Discussion

Returns the cluster of words for the n-th best (zero based) result.


LSMResultCreate


LSMResultRef LSMResultCreate(
    CFAllocatorRef alloc, 
    LSMMapRef mapref,
    LSMTextRef textref, 
    CFIndex numResults,
    CFOptionFlags flags);  
Discussion

Returns, in decreasing order of likelihood, the categories or words that best match when a text is mapped into a map.


LSMResultGetCategory


LSMCategory LSMResultGetCategory(
    LSMResultRef result,
    CFIndex n);  
Discussion

Returns the category of the n-th best (zero based) result.


LSMResultGetCount


CFIndex LSMResultGetCount(
    LSMResultRef result);  
Discussion

Returns the number of results.


LSMResultGetScore


float LSMResultGetScore(
    LSMResultRef result,
    CFIndex n);  
Discussion

Returns the likelihood of the n-th best (zero based) result.


LSMResultGetTypeID


CFTypeID LSMResultGetTypeID(
    void);  
Discussion

Returns the Core Foundation type identifier for LSM results.


LSMTextAddToken


OSStatus LSMTextAddToken(
    LSMTextRef textref,
    CFDataRef token);  
Discussion

Adds an arbitrary binary token to the text. The order of tokens is significant if the map uses pairs or triplets, and the count of tokens is always significant.


LSMTextAddWord


OSStatus LSMTextAddWord(
    LSMTextRef textref,
    CFStringRef word);  
Discussion

Adds a word to the text. The order of words is significant if the map uses pairs or triplets, and the count of words is always significant.


LSMTextAddWords


OSStatus LSMTextAddWords(
    LSMTextRef textref,
    CFStringRef words, 
    CFLocaleRef locale,
    CFOptionFlags flags);  
Discussion

Breaks a string into words using the locale provided and adds the words to the text.


LSMTextCreate


LSMTextRef LSMTextCreate(
    CFAllocatorRef alloc,
    LSMMapRef mapref);  
Discussion

Creates a new text.


LSMTextGetTypeID


CFTypeID LSMTextGetTypeID(
    void);  
Discussion

Returns the Core Foundation type identifier for LSM texts.

Typedefs

LSMCategory
LSMMapRef
LSMResultRef
LSMTextRef

LSMCategory


typedef uint32_t LSMCategory;  
Discussion

An integral type representing a category.


LSMMapRef


typedef struct __LSMMap * LSMMapRef;  
Discussion

An opaque Core Foundation type representing an LSM map (mutable).


LSMResultRef


typedef struct __LSMResult * LSMResultRef;  
Discussion

An opaque Core Foundation type representing the result of a lookup (immutable).


LSMTextRef


typedef struct __LSMText * LSMTextRef;  
Discussion

An opaque Core Foundation type representing an input text (mutable).

Enumerated Types

Error codes
Map Flags
Parsing Flags
Result Flags
Storage Flags

Error codes


enum { 
    kLSMMapOutOfState = -6640, 
    kLSMMapNoSuchCategory = -6641, 
    kLSMMapWriteError = -6642, 
    kLSMMapBadPath = -6643, 
    kLSMMapBadCluster = -6644 
};  
Constants
kLSMMapOutOfState

This call cannot be issued in this map state.

kLSMMapNoSuchCategory

Invalid category specified.

kLSMMapWriteError

An error occurred writing the map.

kLSMMapBadPath

The URL you specified does not exist.

kLSMMapBadCluster

The clusters you specified are invalid.

Discussion

Errors returned from LSM routines.
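When logging failures, it can help to turn these codes back into their symbolic names. The following is a small sketch of one way to do that. ExampleLSMErrorName is a hypothetical helper, not part of LSM, and the enum values are copied from the listing above only so that the sketch compiles without the framework headers; real code would include the LSM header instead of redeclaring them.

```c
#include <stddef.h>
#include <stdint.h>

/* Copied from the Error codes enum so the sketch stands alone.
   Omit this enum when the real LSM header is included. */
enum {
    kLSMMapOutOfState     = -6640,
    kLSMMapNoSuchCategory = -6641,
    kLSMMapWriteError     = -6642,
    kLSMMapBadPath        = -6643,
    kLSMMapBadCluster     = -6644
};

/* Hypothetical helper: returns the symbolic name of an LSM error code,
   or NULL if the status is not one of the LSM codes listed above. */
const char *ExampleLSMErrorName(int32_t status)
{
    switch (status) {
        case kLSMMapOutOfState:     return "kLSMMapOutOfState";
        case kLSMMapNoSuchCategory: return "kLSMMapNoSuchCategory";
        case kLSMMapWriteError:     return "kLSMMapWriteError";
        case kLSMMapBadPath:        return "kLSMMapBadPath";
        case kLSMMapBadCluster:     return "kLSMMapBadCluster";
        default:                    return NULL;
    }
}
```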


Map Flags


enum { 
    kLSMMapPairs = 1, 
    kLSMMapTriplets = 2,  
    kLSMMapHashText = 256 
};  
Constants
kLSMMapPairs

Use pairs in addition to single words.

kLSMMapTriplets

Use triplets and pairs in addition to single words.

kLSMMapHashText

Transform the text so it's not trivially human readable. Disables creation of language models.

Discussion

Options that can be specified for LSMMapCreate. These options can improve mapping accuracy, at a potentially significant increase in memory use.
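The map flags are bit values, so they combine with bitwise OR. The sketch below shows how the flags translate into the longest word sequence a map considers, following the descriptions above. ExampleMaxNGram is a hypothetical helper, and the enum is copied from the listing above only so that the sketch compiles on its own.

```c
/* Copied from the Map Flags enum so the sketch stands alone.
   Omit this enum when the real LSM header is included. */
enum {
    kLSMMapPairs    = 1,
    kLSMMapTriplets = 2,
    kLSMMapHashText = 256
};

/* Hypothetical helper: longest word sequence used by a map created with
   these flags. Triplets imply pairs, so the triplet bit is checked first. */
int ExampleMaxNGram(unsigned long flags)
{
    if (flags & kLSMMapTriplets) return 3;
    if (flags & kLSMMapPairs)    return 2;
    return 1;  /* single words only */
}
```

For example, a map created with kLSMMapPairs | kLSMMapHashText uses single words and pairs, and its text is hashed.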


Parsing Flags


enum { 
    kLSMTextPreserveCase = 1, 
    kLSMTextPreserveAcronyms = 2, 
    kLSMTextApplySpamHeuristics = 4 
};  
Constants
kLSMTextPreserveAcronyms

Don't map all-uppercase words (acronyms) to lowercase.

kLSMTextPreserveCase

Don't change any words to lowercase.

kLSMTextApplySpamHeuristics

Try to find words in hostile text.

Discussion

Options you can specify for LSMTextAddWords.


Result Flags


enum { 
    kLSMResultBestWords = 1, 
};  
Constants
kLSMResultBestWords

Find the words, rather than categories, that best match.

Discussion

Options for LSMResultCreate.


Storage Flags


enum { 
    kLSMMapDiscardCounts = 1, 
    kLSMMapLoadMutable = 2 
};  
Constants
kLSMMapDiscardCounts

Don't keep counts. If specified on loading, the map needs to be reloaded without this option before calling LSMMapStartTraining. If specified on storing, the stored map can't be retrained at all. This option can save a lot of memory and/or disk space.

kLSMMapLoadMutable

Load map as mutable in training state.

kLSMMapHashText

(Defined above) If specified on storing, will hash the map if it hasn't been hashed yet.

Discussion

Options for LSMMapCreateFromURL and LSMMapWriteToURL.

Macro Definitions

kLSMAlgorithmDense
kLSMAlgorithmKey
kLSMAlgorithmSparse
kLSMDimensionKey
kLSMIterationsKey
kLSMPrecisionDouble
kLSMPrecisionFloat
kLSMPrecisionKey
kLSMSweepAgeKey
kLSMSweepCutoffKey

kLSMAlgorithmDense


See Also:

kLSMAlgorithmKey

kLSMAlgorithmSparse

kLSMPrecisionKey

kLSMPrecisionFloat

kLSMPrecisionDouble

kLSMDimensionKey

kLSMIterationsKey

kLSMSweepAgeKey

kLSMSweepCutoffKey

#define kLSMAlgorithmDense CFSTR("LSMAlgorithmDense") 
Discussion

A CFDictionary of arbitrary properties may be associated with an LSM map. The following keys are currently interpreted by LSM, and all other keys starting with LSM... are reserved.


kLSMAlgorithmKey


See Also:

kLSMAlgorithmDense

kLSMAlgorithmSparse

kLSMPrecisionKey

kLSMPrecisionFloat

kLSMPrecisionDouble

kLSMDimensionKey

kLSMIterationsKey

kLSMSweepAgeKey

kLSMSweepCutoffKey

#define kLSMAlgorithmKey CFSTR("LSMAlgorithm") 
Discussion

A CFDictionary of arbitrary properties may be associated with an LSM map. The following keys are currently interpreted by LSM, and all other keys starting with LSM... are reserved.


kLSMAlgorithmSparse


See Also:

kLSMAlgorithmKey

kLSMAlgorithmDense

kLSMPrecisionKey

kLSMPrecisionFloat

kLSMPrecisionDouble

kLSMDimensionKey

kLSMIterationsKey

kLSMSweepAgeKey

kLSMSweepCutoffKey

#define kLSMAlgorithmSparse CFSTR("LSMAlgorithmSparse") 
Discussion

A CFDictionary of arbitrary properties may be associated with an LSM map. The following keys are currently interpreted by LSM, and all other keys starting with LSM... are reserved.


kLSMDimensionKey


See Also:

kLSMAlgorithmKey

kLSMAlgorithmDense

kLSMAlgorithmSparse

kLSMPrecisionKey

kLSMPrecisionFloat

kLSMPrecisionDouble

kLSMIterationsKey

kLSMSweepAgeKey

kLSMSweepCutoffKey

#define kLSMDimensionKey CFSTR("LSMDimension") 
Discussion

A CFDictionary of arbitrary properties may be associated with an LSM map. The following keys are currently interpreted by LSM, and all other keys starting with LSM... are reserved.


kLSMIterationsKey


See Also:

kLSMAlgorithmKey

kLSMAlgorithmDense

kLSMAlgorithmSparse

kLSMPrecisionKey

kLSMPrecisionFloat

kLSMPrecisionDouble

kLSMDimensionKey

kLSMSweepAgeKey

kLSMSweepCutoffKey

#define kLSMIterationsKey CFSTR("LSMIterations") 
Discussion

A CFDictionary of arbitrary properties may be associated with an LSM map. The following keys are currently interpreted by LSM, and all other keys starting with LSM... are reserved.


kLSMPrecisionDouble


See Also:

kLSMAlgorithmKey

kLSMAlgorithmDense

kLSMAlgorithmSparse

kLSMPrecisionKey

kLSMPrecisionFloat

kLSMDimensionKey

kLSMIterationsKey

kLSMSweepAgeKey

kLSMSweepCutoffKey

#define kLSMPrecisionDouble CFSTR("LSMPrecisionDouble") 
Discussion

A CFDictionary of arbitrary properties may be associated with an LSM map. The following keys are currently interpreted by LSM, and all other keys starting with LSM... are reserved.


kLSMPrecisionFloat


See Also:

kLSMAlgorithmKey

kLSMAlgorithmDense

kLSMAlgorithmSparse

kLSMPrecisionKey

kLSMPrecisionDouble

kLSMDimensionKey

kLSMIterationsKey

kLSMSweepAgeKey

kLSMSweepCutoffKey

#define kLSMPrecisionFloat CFSTR("LSMPrecisionFloat") 
Discussion

A CFDictionary of arbitrary properties may be associated with an LSM map. The following keys are currently interpreted by LSM, and all other keys starting with LSM... are reserved.


kLSMPrecisionKey


See Also:

kLSMAlgorithmKey

kLSMAlgorithmDense

kLSMAlgorithmSparse

kLSMPrecisionFloat

kLSMPrecisionDouble

kLSMDimensionKey

kLSMIterationsKey

kLSMSweepAgeKey

kLSMSweepCutoffKey

#define kLSMPrecisionKey CFSTR("LSMPrecision") 
Discussion

A CFDictionary of arbitrary properties may be associated with an LSM map. The following keys are currently interpreted by LSM, and all other keys starting with LSM... are reserved.


kLSMSweepAgeKey


See Also:

kLSMAlgorithmKey

kLSMAlgorithmDense

kLSMAlgorithmSparse

kLSMPrecisionKey

kLSMPrecisionFloat

kLSMPrecisionDouble

kLSMDimensionKey

kLSMIterationsKey

kLSMSweepCutoffKey

#define kLSMSweepAgeKey CFSTR("LSMSweepAge") 
Discussion

A CFDictionary of arbitrary properties may be associated with an LSM map. The following keys are currently interpreted by LSM, and all other keys starting with LSM... are reserved.


kLSMSweepCutoffKey


See Also:

kLSMAlgorithmKey

kLSMAlgorithmDense

kLSMAlgorithmSparse

kLSMPrecisionKey

kLSMPrecisionFloat

kLSMPrecisionDouble

kLSMDimensionKey

kLSMIterationsKey

kLSMSweepAgeKey

#define kLSMSweepCutoffKey CFSTR("LSMSweepCutoff") 
Discussion

A CFDictionary of arbitrary properties may be associated with an LSM map. The following keys are currently interpreted by LSM, and all other keys starting with LSM... are reserved.

 

Last Updated: 2009-04-17