Welcome to the SAX (Symbolic Aggregate approXimation) Homepage!
SAX was invented by Eamonn Keogh and Jessica Lin in 2002, with funding from NSF Career Award 0237918. Edward Tufte was kind enough to mention that SAX allows a sparkline-like visualization of data. The relevant paper is this one [pdf]. Li Wei has generalized the SAX code to handle the case where N/n is not an integer, and to allow alphabet sizes up to 20. Download this zip file for the code and details. If you want a copy of my SAX time series/shape tutorial, download this. Here is a video of Dr. Keogh giving a talk at Google about using SAX for various problems, including shape mining. Much of the utility of SAX has now been subsumed by iSAX, a generalization of SAX that allows indexing and mining of massive datasets. Visit the iSAX page.
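For readers who want a feel for what the code does before downloading it, here is a minimal Python sketch of the usual SAX pipeline: z-normalize, reduce with Piecewise Aggregate Approximation (PAA), then map each segment mean to a letter using breakpoints that cut the standard normal distribution into equiprobable regions. It is not Li Wei's code. The function names (`znorm`, `paa`, `sax`), the use of the population standard deviation, and the reading of N as the series length and n as the number of segments are all assumptions made for illustration. The non-integer N/n case is handled here by weighting each point by the fraction of it that falls inside a segment, which is one common way to do it.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev
from string import ascii_lowercase


def znorm(series):
    """Z-normalize a series (population std is an assumption of this sketch)."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)  # constant series: map everything to zero
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Piecewise Aggregate Approximation that also works when
    len(series) / n_segments is not an integer.

    Integer arithmetic is used on a grid scaled by n_segments: point j covers
    [j*w, (j+1)*w) and segment i covers [i*n, (i+1)*n), so overlaps are exact.
    """
    n, w = len(series), n_segments
    means = []
    for i in range(w):
        lo, hi = i * n, (i + 1) * n
        total = 0.0
        j = lo // w  # first point that overlaps this segment
        while j * w < hi:
            overlap = min(hi, (j + 1) * w) - max(lo, j * w)
            total += series[j] * overlap
            j += 1
        means.append(total / n)  # each segment has scaled length n
    return means


def sax(series, word_size, alphabet_size):
    """Return the SAX word for `series` as a string of lowercase letters."""
    if not 2 <= alphabet_size <= len(ascii_lowercase):
        raise ValueError("alphabet_size must be between 2 and 26 in this sketch")
    # Breakpoints split N(0, 1) into alphabet_size equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(k / alphabet_size)
                   for k in range(1, alphabet_size)]
    segment_means = paa(znorm(series), word_size)
    # A mean equal to a breakpoint is assigned to the upper region.
    return "".join(ascii_lowercase[bisect_right(breakpoints, m)]
                   for m in segment_means)


word = sax([1, 2, 3, 4, 5, 6, 7, 8], word_size=4, alphabet_size=4)
```

Calling `sax([1, 2, 3], word_size=2, alphabet_size=2)` exercises the non-integer case: each of the two segments covers one and a half points.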
Papers by Keogh and collaborators that use SAX (in random order).
In [1] we show how to use SAX to find time series discords, which are unusual time series. In [2] we consider a special case of SAX, which has an alphabet size of 2 and a word size equal to the length of the raw data, and show that we can use this bit-level representation for a variety of data mining tasks. In [3] we show how to use SAX to create time series bitmaps, which allow visualization of time series data directly within a standard GUI such as MS Windows. In [4] we further show how to use time series bitmaps to do anomaly detection. In [5] we show that SAX can support parameter-lite data mining of time series, including classification and clustering. In [7] we show that SAX can replace standard representations of time series (e.g., DWT, DFT) for all classic data mining problems, including classification, clustering and indexing. We first used SAX to find time series motifs (exactly, and somewhat fast) in [9], and later showed a blindingly fast probabilistic algorithm in [8]. In [10] we tentatively showed how to use SAX to meaningfully cluster time series streams. In [12] we show an application of SAX to a shape mining problem, and in [11] we generalize the time series bitmap concept to more general datasets. In [13] we show how to use SAX to find approximately duplicated shapes (shape motifs) in large databases. Paper [14] is a journal paper reviewing SAX's first two years. Paper [15] shows how to find motifs under uniform scaling. Paper [16] introduces iSAX. Paper [17] shows how to do SAX on resource-limited sensors.
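As a rough illustration of the special case mentioned for [2] (alphabet size 2, one symbol per raw data point), the sketch below turns a series into a sequence of bits by comparing each point with the series mean. The function name `clip_to_bits` is hypothetical, and sending points exactly equal to the mean to 0 is an assumption of this sketch rather than something taken from the paper.

```python
from statistics import mean


def clip_to_bits(series):
    """Bit-level view of a series: 1 where a point is above the mean, else 0.

    With an alphabet of size 2 the single breakpoint sits at the mean of the
    normalized data, so no dimensionality reduction or scaling is needed here.
    Ties with the mean go to 0 (an assumption of this sketch).
    """
    mu = mean(series)
    return [1 if x > mu else 0 for x in series]


bits = clip_to_bits([1, 5, 2, 8])
```

Each point costs one bit, so the representation is as long as the raw data but far more compact to store.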
