Archive for category MATLAB

Matlab Toolbox: Fuzzy Clustering and Data Analysis Toolbox

I googled “fuzzy clustering and matlab” and wanted to find some tools for data clustering today. I came across a very good MATLAB toolbox at In the web page, you can find source code and a detailed documentation that describes several popular clustering methods and validation methods, too. The Toolbox is a collection of MATLAB functions. The toolbox provides four categories of functions: 1) Clustering algorithms, 2) Evaluation with cluster prototypes, 3) Validation and 4) Visualization (See the following figure). In addition, it also provides some real examples that you can use them as start point for your project. To learn more you can visit author’s website or download source code and documentation here.


MATLAB Source code –
Toolbox documentation – FuzzyClusteringToolbox


Tags: , , ,

Graph Digitization in MATLAB

Sometimes we need digitize graphs from scientific papers because we can not get the original data. The digitized data can be used to fit model or redraw graphs. How can we achieve the goal? At the beginning I think we can do that by reading the image data and classifying the data by a given threshhold. Then put the data to spreadsheet (EXCEL) and do further precessing. After I tried, I found it was difficult to do it in EXCEL.

I turned to MATLAB. I found there are several general functions in MATLAB that can handle this problem very easily. By using imread function to read the image data to memory, and image function to show the image, then ginput to get data points in the graphs. After all data point is digitized, use linear regression method to transfer the x and y coordiate values to actual values with the units used in the graph. Then show the data again and save them to text files.

The following code is a complete m function. You can modify it to fit your problem. The arguments of the function are:

  1. imgfilename : the filename of image file, must in the same directory as the m file is;
  2. xp : the actual values of X axis, a row vector;
  3. yp : the actual values of Y axis, a row vector;
  4. line1 : the name of the first line, a string;
  5. line2 : the name of the second line, a string;

The results will be saved to two seperate text files, one file per line. The first column is X value, the second column is Y values. In my case, X is time and Y is gene expression level.

function digitize(imgfilename,xp,yp,line1,line2)
a = imread([imgfilename '.bmp']);
A1=[ones(size(TIME)) TIME];
A2=[ones(size(TOC1)) TOC1];
XLerTime=[ones(size(XLer)) XLer]*C1;
YLerToc1=[ones(size(YLer)) YLer]*C2;
XlhyTime=[ones(size(Xlhy)) Xlhy]*C1;
YlhyToc1=[ones(size(Ylhy)) Ylhy]*C2;
hold on;
DATA1=[XLerTime YLerToc1];
DATA2=[XlhyTime YlhyToc1];
save([imgfilename line1 '.txt'],'DATA1','-ascii');
save([imgfilename line2 '.txt'],'DATA2','-ascii');

After the simple version was finished. I told my major professor and he suggest me to do some statistical tests for the digitizing processing. I rewrote the program. The source code is listed below. In this version, argument ‘lines’ has changed to a cell array, it can accommodate as many lines as possible according to your specific graphs.

function digitize(imgfilename,xp,yp,lines)
% This function can digitize as many lines as you indicate in 'lines'
% imgfilename - the filename of image file, a string
% xp - the actual data value of data points in X axis, a row vector
% yp - the actual data value of data points in Y axis, a row vector
% lines - the string name of lines, a cell array
a = imread([imgfilename '.bmp']);
% number of data points in X axis
% number of data points in Y axis
% number of lines
nline = length(lines);
% number of times to digitize axes
naxes = 3;

Y = []; % all Ys put into one vector
YY = []; % all Ys put into a matrix
% get the data points in Y axis for several times
for i=1:naxes
    Y = [Y;y];
    YY = [YY y];
% pool the data points from three repetition together
A=[ones(size(Y)) Y];
% linear regression model for Y axis
yps = repmat(yp,1,naxes);
% calculate the earror value
ym = mean(YY,2); % mean for each data point
for i=1:size(YY,2), YY(:,i)=YY(:,i)-ym;, end
yy = reshape(YY,1,size(YY,1)*size(YY,2));
yse = std(yy)
yse = CY(2) * yse

X = []; % all Ys put into one vector
XX = []; % all Ys put into a matrix
% get the data points in X axis for several times
for i=1:naxes
    X = [X;x];
    XX = [XX x];
% pool the data points from three repetition together
A=[ones(size(X)) X];
% linear regression model for Y axis
xps = repmat(xp,1,naxes);
% calculate the earror value
xm = mean(XX,2); % mean for each data point
for i=1:size(XX,2), XX(:,i)=XX(:,i)-xm;, end
xx = reshape(XX,1,size(XX,1)*size(XX,2));
xse = std(xx)
xse = CX(2) * xse

% get the data points for all lines
% after digitization of one line, press return key to advance to next line
XLine = cells(nline,1);
YLine = cells(nline,1);
for i = 1:nline
    XLine{i}=[ones(size(xLine)) xLine]*CX;
    YLine{i}=[ones(size(yLine)) YLine]*CY;
for i = 1:nline
    hold on;
    DATA=[XLine{i} YLine{i}];
    save([imgfilename lines{i} '.txt'],'DATA','-ascii');

This version really aspired me. It is not easy to use it because if you input some thing wrong. You need stop the program and run it again until you did everything right. MATLAB provide a tool, GUIDE, to make graphic user interface. Just by trying, I started to make a UI program to digitize the graph. And I succeeded. See the following screenshot.

In the GUI version, users can do the digitization one by one. The order is not very important. You can digitize Y axis, ten X axis. After that test the perpendicular property of two axes. Then digitize lines one by one as long as you give a different name for a line. The program can save the digitized data to different files. You do not need to do it no break. You can do it pierce by pierce as long as the window is not closed. This version give user the maximum flexibility to finish the work.
Because the source code of graphic user interface is too long. I just put it into the zip file. Please download the program and run it under MATLAB 6.5. Download all source code here – digitizeit



Tags: , , , , , , ,

Build a boolean network in MATLAB

To build a boolean network is pretty simple in MATLAB. It does not need ODE function to solve the problem. Just put the rate of change logic equations in a ordinary equation, to calculate several steps (step size always 1). You can get the answer. The sample code is provided here.

function test;
a = 1;
b = 1;
c = 1;
for i = 1:10
    [a, b, c] = calabc(a, b, c);
    [a b c]

function [a,b,c] = calabc(a0, b0, c0)
a = b0;
b = a0 | c0;
c = (a0 & b0) | (b0 & c0) | (a0 & c0);

Tags: ,

Concept of using multiple PCs within LAN for parallel computing and its implementation in MATLAB



  1. 减少函数的调用,不仅仅指自己写的函数,对内建函数的调用也要尽可能地减少;
  2. 不用或少用全局变量以及结构变量;
  3. 预先申请数据空间,减少动态改变数组大小的语句;
  4. 将所有调用函数集中到一个函数中,使用mcc命令将其编译成mex格式的可执行文件;
  5. 尽可能将循环化为向量,然后进行矩阵运算;




function [dap] = master(x,xdata,nummachines)
if nargin < 3
    nummachines = 6;

for i=2:nummachines
    save(['slave' num2str(i) 'ok.txt'],'i','-ascii');
switch nummachines
    case 1
        slave1data = xdata;
    case 4
        slave1data = xdata(:,1:28);
        slave2data = xdata(:,29:48);
        slave3data = xdata(:,49:60);
        slave4data = xdata(:,61:end);
    case 6
        slave1data = xdata(:,1:20);
        slave2data = xdata(:,21:36);
        slave3data = xdata(:,37:47);
        slave4data = xdata(:,48:55);
        slave5data = xdata(:,56:66);
        slave6data = xdata(:,67:end);

for i=2:nummachines
    save(['slave' num2str(i) 'x.txt'],'x','-ascii');
    save(['slave' num2str(i) 'data.txt'],['slave' num2str(i) 'data'],'-ascii');

for i=2:nummachines
    delete(['slave' num2str(i) 'ok.txt']);

dap = atclocktrainfun(x,slave1data);

for i=2:nummachines
    while ~exist(['slave' num2str(i) 'result.txt'])

for i=2:nummachines
    while exist(['slave' num2str(i) 'resultok.txt'])

for i=2:nummachines
    dap = [dap load(['slave' num2str(i) 'result.txt'])];
    delete(['slave' num2str(i) 'result.txt']);


function slave(i)
STOP = 0;
save(['slave' num2str(i) 'resultok.txt'],'i','-ascii');
while ~STOP
    STOP = exist('stop.txt');
    if STOP, break; end

    while ~exist(['slave' num2str(i) 'x.txt'])
        STOP = exist('stop.txt');
        if STOP, break;end
    while ~exist(['slave' num2str(i) 'data.txt']);
        STOP = exist('stop.txt');
        if STOP, break;end

    while exist(['slave' num2str(i) 'ok.txt']);
        STOP = exist('stop.txt');
        if STOP, break;end
    x = load(['slave' num2str(i) 'x.txt']);
    xdata = load(['slave' num2str(i) 'data.txt']);
    delete(['slave' num2str(i) 'x.txt']);
    delete(['slave' num2str(i) 'data.txt']);

    target = atclocktrainfun(x,xdata);
    save(['slave' num2str(i) 'result.txt'],'target','-ascii');
    delete(['slave' num2str(i) 'resultok.txt']);

Tags: , , , , , , ,

Downhill Simplex Method in Multidimensions

In this article, C program of the downhill simplex method in Numeric Recipe in C – the Art if Scientific computing (Second Edition, Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P., 1992, Cambridge University Press, New York) was translated into MATLAB code. The following text comes from the book. (p.408-410)

The downhill simplex mehtod is due to Nelder and Mead. The mothod requires only function evaluations, not derivatives. It is not very efficient in terms of the number of function evaluations that it requires.
A simplex if the geometrical figure consisting, in N dimensions, of N+1 points (or vertices) and all their interconnecting line segments, polygonal faces, etc. In two dimensions, a simplex is a triangle. In three dimensions it is a tetrahedron, not necessarily the regular tetrahedron. In general we are only interested in simplexes that are nondegenerate, i.e., that enclose a finite inner N-dimensional volume. If any point of a nondegenerate simplex is taken as the origin, then the N other points define vector directions that span the N-dimensional vector space.

In one-dimensional minimizations, it was possible to bracket a minimum, so that the success of a subsequent isolation was guaranteed. There is no analogous procedure in multidimensional space. For multidimensional minimization, the best we can do is give our algorithm a starting guess, that is, an N-vector of independent variables as the first point to try. The algorithm is then supposed to make iss own waydownhill through the unimaginable complexity of an N-dimensional topography until it encounters a (local, at least) minimum.
The downhill simplex method must be started not just with a single point, but with N+1 points, defining an initial simplex. If you think of one of these points (it matters not which) as being your initial starting point P0, then you can take the other N points to be
Pi = P0 + Lei
where the ei’s are N unit vectors, and where L is a constant which is your guess of the problem’s characteric length scale. You could have different Ls for each vector direction.

The downhill simplex method now takes a series of steps, most steps just moving the point of the simplex where the function if largest (“highest point”) through the opposite face of the simplex to a lower point. These steps are called reflections, and they are constructed to conserve the volume of the simplex (hence maintain its nondegeneracy). When it can do so, the method expands the simplex in one or another direction to take larger steps. When it reaches a valley floor, the method constracts itself in the transerve direction and tries to ooze down the valley. If there is a situation where the simplex itself in all directions, pulling itself in around its lowest (best) point. The routine name amoeba is intended to be descriptive of this kind of behavior.
Termination criteria can be delicate in any multidimensional minimization routine. Without bracketing, and with more than one independent variable, we no longer have the option of requiring a certain tolerance for a single independent variable. We typically can identify one cycle or step of our multidimensional algorithms. It is then possible to terminate when the vector distance moved in that step is fractionally small in magnitude than some tolerance tol. Alternatively, we could require that the descrease in the function value in the terminating step be fractionally smaller than some tolerance ftol. Note that while tol should not usually be smaller than the square root of the machine precision, it is perfectly approciate to let ftol be of order the machine precision (or perhaps slightly larger so as not to be diddled br roundoff).

Note well that either of hte above criteria might be fooled by a single anomalous step that, for one reason of another, failed to get anywhere. Therefore, it is frequently a good idea to restart a multidimensional minimization routine at a point where it claims to have found a minimum. For this restart, you should reinitialize any ancillary input quantities. In the downhill simplex method, for example, you should reinitialize N of N + 1 vertices of the simplex again by the above equation, with P0 being one of the vetices of the claimed minimum.


function [p,y,nfunk] = amoeba(funk, p, y, ndim, ftol, NMAX)
%function [p,y,nfunk] = amoeba(funk, p, y, ndim, ftol, NMAX)
% funk - the name of the function that you want to minimize
% p - the N+1-by-N matrix, row stands for the vertices, column stands for
% the variables
% y - the function values of N+1 vertices
% ndim - the number of variables
% ftol - functional convergence tolerance
% NMAX - the maximum number of function evaluations
%   p,y has the same meaning
%   nfunk - the number of function evaluations
nfunk = 0;
TINY = 1e-20;
mpts = ndim + 1;
psum = sum(p,1);
while (1==1)
    ilo = 1;
    if y(1)>y(2)
        inhi = 2;
        ihi  = 1;
        inhi = 1;
        ihi  = 2;
    for i=1:mpts
        if (y(i) <= y(ilo)) 
            ilo = i;
        if (y(i) > y(ihi)) 
            inhi = ihi;
            ihi  = i;
        elseif (y(i) > y(inhi)) && (i ~= ihi)
            inhi = i;
    rtol = 2.0 * abs(y(ihi)-y(ilo))/(abs(y(ihi)) + abs(y(ilo))+TINY);
    if (rtol < eps)
        [y(1),y(ilo)] = swap(y(1),y(ilo));
        [p(1,:),p(ilo,:)] = swap(p(1,:),p(ilo,:));
    if (nfunk >= NMAX) 
    nfunk = nfunk + 2;
    [ytry, p, y, psum] = amotry(funk,p,y,psum,ndim,ihi,-1.0);
    if (ytry <= y(ilo))
        [ytry, p, y, psum] = amotry(funk,p,y,psum,ndim,ihi,2.0);
    elseif (ytry >= y(inhi))
        [ytry, p, y, psum] = amotry(funk,p,y,psum,ndim,ihi,0.5);
        if (ytry >= ysave)
            for i=1:mpts
                if (i ~= ilo)
                    psum=0.5*(p(i,:) + p(ilo,:));
            nfunk = nfunk + ndim;
            psum = sum(p,1);
        nfunk = nfunk - 2;

function [ytry, p, y, psum] = amotry(funk, p, y, psum, ndim,ihi, fac)
fac1 = (1.0-fac) / ndim;
fac2 = fac1 - fac;
ptry = psum .* fac1 - p(ihi,:) .* fac2;
ytry = feval(funk,ptry);
if (ytry < y(ihi))
    y(ihi) = ytry;
    psum = psum + ptry - p(ihi,:);
    p(ihi,1:ndim)= ptry;

function [a,b]=swap(a,b)


for i=1:MP
    for j=1:NP
        if i == j+1
            p(i,j) = 1.0;
            p(i,j) = 0.0;
disp(sprintf('%3s %10s %12s %12s %14s\n\n','i','x[i]','y[i]','z[i]','function'));
for i=1:MP
    ss1 = [sprintf('%3d ',i)];
    for j=1:NP
        ss1 = [ss1 sprintf('%12.6f ',p(i,j))];
    ss1 = [ss1 sprintf('%12.6f\n',y(i))];
[p,y,nfunc] = amoeba(@func,p,y,ndim,eps,1000);
disp(sprintf('\nNumber of function evaluations: %3d\n',nfunc));
disp(sprintf('Vertices of final 3-d simplex and\n'));
disp(sprintf('function values at the vertices:\n\n'));
disp(sprintf('%3s %10s %12s %12s %14s\n\n','i','x[i]','y[i]','z[i]','function'));
for i=1:MP
    ss1 = [sprintf('%3d ',i)];
    for j=1:NP
        ss1 = [ss1 sprintf('%12.6f ',p(i,j))];
    ss1 = [ss1 sprintf('%12.6f\n',y(i))];
disp(sprintf('\nTrue minimum is at (0.5,0.6,0.7)\n'));  


function [res] = func(x)
res = 0.6 - bessj0((x(1)-0.5)^2 + (x(2)-0.6)^2 + (x(3)-0.7)^2);

function [ans] = bessj0(x)
if (ax < 8.0) 
    ans1=57568490574.0+y*(-13362590354.0+y*(651619640.7 ...
    ans2=57568490411.0+y*(1029532985.0+y*(9494680.718 ...
    ans1=1.0+y*(-0.1098628627e-2+y*(0.2734510407e-4 ...
    ans2 = -0.1562499995e-1+y*(0.1430488765e-3 ...
        +y*(-0.6911147651e-5+y*(0.7621095161e-6 ...

Tags: , , ,