Reading CSV files and transforming data

The basic function of PyChart is to plot sample data in a variety of ways. Samples are simply a sequence of sequences, where "sequence" is a Python term for either a tuple (comma-separated values enclosed in parentheses) or a list (comma-separated values enclosed in square brackets). Data are passed to plots through the "data" attribute:

l = line_plot.T(data=[(10,20), (11,38), (12,29)], xcol=0, ycol=1)

In the above example, three sample points will be drawn along with line segments that connect them: (10, 20) - (11, 38) - (12, 29). Attribute xcol gives the position of the X value within each sample (here, the first column), and ycol similarly gives the position of the Y value (here, the second column). A sample point can contain None, in which case that point is ignored.

data = [(10, 20, 21), (11, 38, 22), (13, None, 15), (12, 29, 30)]
l1 = line_plot.T(data=data, xcol=0, ycol=1)
l2 = line_plot.T(data=data, xcol=0, ycol=2)

The above example is equivalent to:

l1 = line_plot.T(data=[(10, 20), (11, 38), (12, 29)], xcol=0, ycol=1)
l2 = line_plot.T(data=[(10, 21), (11, 22), (13, 15), (12, 30)], xcol=0, ycol=1)
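
The column selection described above can be pictured in plain Python. This is only a sketch of the behavior, not PyChart's own code: the helper name select_points is hypothetical, and it assumes that a sample is dropped when either of its selected values is None.

```python
def select_points(data, xcol, ycol):
    """Return (x, y) pairs from data, skipping samples with a None value."""
    points = []
    for sample in data:
        x, y = sample[xcol], sample[ycol]
        if x is None or y is None:
            continue  # assumption: incomplete samples are not plotted
        points.append((x, y))
    return points

data = [(10, 20, 21), (11, 38, 22), (13, None, 15), (12, 29, 30)]
print(select_points(data, 0, 1))  # [(10, 20), (11, 38), (12, 29)]
print(select_points(data, 0, 2))  # [(10, 21), (11, 22), (13, 15), (12, 30)]
```

The two results match the data lists passed to l1 and l2 in the equivalent form.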

Module chart_data provides several functions for generating, reading, or transforming samples.

chart_data.read_csv FILE, DELIM = ',' Function
This function reads comma-separated values from FILE. Empty lines and lines beginning with "#" are ignored. DELIM specifies how a line is separated into values. If it does not contain the character "%", then DELIM is the string that separates one value from the next. Otherwise, DELIM is treated as a format string and this function acts like scanf in C:
chart_data.read_csv("file", "%d,%s:%d")

DELIM currently supports only three conversion format specifiers: "d" (int), "f" (float), and "s" (string).
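
To make the scanf-like mode concrete, the following sketch parses a single line against a format string. It is an illustration written for this section, not PyChart's implementation: the names parse_line and _SPECS are hypothetical, and the regular expressions chosen for each specifier are assumptions.

```python
import re

# Hypothetical table: conversion letter -> (regex for the value, converter).
_SPECS = {
    "d": (r"([-+]?\d+)", int),
    "f": (r"([-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?)", float),
    "s": (r"(.*?)", str),
}

def parse_line(line, fmt):
    """Split one line according to a scanf-like format such as '%d,%s:%d'."""
    pattern, converters = "", []
    i = 0
    while i < len(fmt):
        if fmt[i] == "%" and i + 1 < len(fmt) and fmt[i + 1] in _SPECS:
            regex, conv = _SPECS[fmt[i + 1]]
            pattern += regex
            converters.append(conv)
            i += 2
        else:
            pattern += re.escape(fmt[i])  # literal separator character
            i += 1
    match = re.match(pattern + "$", line.strip())
    if match is None:
        raise ValueError("line %r does not match format %r" % (line, fmt))
    return [conv(text) for conv, text in zip(converters, match.groups())]

print(parse_line("10,foo:30", "%d,%s:%d"))  # [10, 'foo', 30]
```

Each line of the file would yield one sample in this way, so the format string fixes both the separators and the type of every column.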

chart_data.fread_csv FP, DELIM=',' Function
This function is similar to read_csv, except that it reads from an open file handle FP.
fp = open("foo", "r")
data = chart_data.fread_csv(fp, ",")

chart_data.read_str DELIM, LINES Function
This function is similar to read_csv, but it reads data from LINES, a list of strings.
fp = open("foo", "r")
data = chart_data.read_str(",", fp.readlines())

chart_data.func F, FROM, TO, STEP Function
Create sample points from function F, which must be a single-parameter function that returns a number (e.g., math.sin). FROM and TO specify the first and last X values, and STEP specifies the sampling interval.
sin_samples = chart_data.func(math.sin, 0, math.pi*4, 0.1)
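
The sampling can be sketched in plain Python as follows. This is not PyChart's source: the name sample_function is hypothetical, and the loop condition assumes that sampling stops before TO is reached, which the description above does not settle.

```python
import math

def sample_function(f, x_from, x_to, step):
    """Return [(x, f(x)), ...] for x = x_from, x_from + step, ... below x_to."""
    samples = []
    x = x_from
    while x < x_to:  # assumption: x_to itself is excluded
        samples.append((x, f(x)))
        x += step
    return samples

squares = sample_function(lambda x: x * x, 0, 3, 1)
print(squares)  # [(0, 0), (1, 1), (2, 4)]

sin_samples = sample_function(math.sin, 0, math.pi * 4, 0.1)
print(sin_samples[0])  # (0, 0.0)
```

The result is an ordinary list of two-element samples, so it can be passed to a plot with xcol=0 and ycol=1.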

chart_data.filter FUNC, DATA Function
FUNC must be a single-argument function that takes a sequence (i.e., a sample point) and returns a boolean. This procedure calls FUNC on each element in DATA and returns a list comprising elements for which FUNC returns true.
>>> data = [[1,5], [2,10], [3,13], [4,16]]
>>> chart_data.filter(lambda x: x[1] % 2 == 0, data)
[[2, 10], [4, 16]]

chart_data.extract_rows DATA, ROWS... Function
Extract rows specified in the argument list.
>>> chart_data.extract_rows([[10,20], [30,40], [50,60]], 1, 2)
[[30,40],[50,60]]

chart_data.extract_columns DATA, COLS... Function
Extract columns specified in the argument list.
>>> chart_data.extract_columns([[10,20], [30,40], [50,60]], 0)
[[10],[30],[50]]
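
Both extraction functions amount to simple indexing. The sketch below reproduces the two examples in plain Python. The names pick_rows and pick_columns are hypothetical, and returning columns in the order of the arguments is an assumption.

```python
def pick_rows(data, *rows):
    """Return the samples of data at the given row indexes."""
    return [data[r] for r in rows]

def pick_columns(data, *cols):
    """Return, for every sample, only the values at the given column indexes."""
    return [[sample[c] for c in cols] for sample in data]

data = [[10, 20], [30, 40], [50, 60]]
print(pick_rows(data, 1, 2))     # [[30, 40], [50, 60]]
print(pick_columns(data, 0))     # [[10], [30], [50]]
print(pick_columns(data, 1, 0))  # [[20, 10], [40, 30], [60, 50]]
```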

chart_data.transform FUNC, DATA Function
Apply FUNC on each element in DATA and return the list consisting of the return values from FUNC.
>>> data = [[10,20], [30,40], [50,60]]
>>> chart_data.transform(lambda x: [x[0], x[1]+1], data)
[[10, 21], [30, 41], [50, 61]]
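
As described, filter and transform behave like Python list comprehensions over the samples. The following plain-Python sketch (it does not call PyChart) chains the two steps on the data from the filter example. The variable names are illustrative only.

```python
data = [[1, 5], [2, 10], [3, 13], [4, 16]]

# Keep the samples whose Y value is even (the role of chart_data.filter).
even = [sample for sample in data if sample[1] % 2 == 0]
print(even)  # [[2, 10], [4, 16]]

# Add 1 to every remaining Y value (the role of chart_data.transform).
shifted = [[sample[0], sample[1] + 1] for sample in even]
print(shifted)  # [[2, 11], [4, 17]]
```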

chart_data.moving_average DATA, XCOL, YCOL, WIDTH Function
Compute the moving average of YCOL'th column of each sample point in DATA. In particular, for each element I in DATA, this function extracts up to WIDTH*2+1 elements, consisting of I itself, WIDTH elements before I, and WIDTH elements after I. It then computes the mean of the YCOL'th column of these elements, and it composes a two-element sample consisting of XCOL'th element and the mean.
>>> data = [[10,20], [20,30], [30,50], [40,70], [50,5]]
>>> chart_data.moving_average(data, 0, 1, 1)
[(10, 25.0), (20, 33.333333333333336), (30, 50.0), (40, 41.666666666666664), (50, 37.5)]


The above value actually represents:


[(10, (20+30)/2), (20, (20+30+50)/3), (30, (30+50+70)/3),
  (40, (50+70+5)/3), (50, (70+5)/2)]
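
The windowing rule can be restated in a few lines of plain Python. This is a sketch of the rule as described above, not PyChart's source, and the name moving_average_sketch is hypothetical.

```python
def moving_average_sketch(data, xcol, ycol, width):
    """Average the ycol values of each sample and its neighbors within width."""
    result = []
    for i, sample in enumerate(data):
        lo = max(0, i - width)              # the window is clipped at the start
        hi = min(len(data), i + width + 1)  # and at the end of the data
        window = [row[ycol] for row in data[lo:hi]]
        result.append((sample[xcol], sum(window) / float(len(window))))
    return result

data = [[10, 20], [20, 30], [30, 50], [40, 70], [50, 5]]
averaged = moving_average_sketch(data, 0, 1, 1)
print([(x, round(y, 2)) for x, y in averaged])
# [(10, 25.0), (20, 33.33), (30, 50.0), (40, 41.67), (50, 37.5)]
```

Because the window is clipped at both ends, the first and last samples are averaged over fewer than WIDTH*2+1 elements, which is why they are divided by 2 rather than 3 in the example.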

chart_data.median DATA, COL=1 Function
Compute the median of the COL'th column of the values in DATA. For example, chart_data.median([(10,20), (20,4), (30,5)], 0) returns 20, and chart_data.median([(10,20), (20,4), (30,5)], 1) returns 5.
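
The two results can be checked with the standard statistics module. This is a sketch that does not call PyChart. The helper name column_median is hypothetical, and for an even number of samples statistics.median averages the two middle values, which may differ from what chart_data.median does.

```python
from statistics import median

def column_median(data, col=1):
    """Return the median of the col'th value of every sample in data."""
    return median([sample[col] for sample in data])

data = [(10, 20), (20, 4), (30, 5)]
print(column_median(data, 0))  # 20
print(column_median(data, 1))  # 5
```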