Python: How to use Python to generate a random sparse symmetric matrix?


How can I use Python to generate a random sparse symmetric matrix?

In MATLAB, there is the function "sprandsym(size, density)".

But how can I do that in Python?

If you have SciPy, you can use sparse.random. The sprandsym function below generates a sparse random matrix x, takes its upper triangular half, and adds its transpose to form a symmetric matrix. Since this doubles the diagonal values, the diagonal is subtracted once.

The non-zero values are normally distributed with mean 0 and standard deviation 1. The Kolmogorov-Smirnov test is used to check that the non-zero values are consistent with a draw from a normal distribution, and a histogram and a QQ-plot are generated to visualize the distribution.

    import numpy as np
    import scipy.stats as stats
    import scipy.sparse as sparse
    import matplotlib.pyplot as plt
    np.random.seed((3,14159))

    def sprandsym(n, density):
        # Draw the non-zero values from a standard normal distribution.
        rvs = stats.norm().rvs
        x = sparse.random(n, n, density=density, data_rvs=rvs)
        upper_x = sparse.triu(x)
        # Mirror the upper triangle, then subtract the doubled diagonal once.
        result = upper_x + upper_x.T - sparse.diags(x.diagonal())
        return result

    m = sprandsym(5000, 0.01)
    print(repr(m))
    # <5000x5000 sparse matrix of type '<class 'numpy.float64'>'
    #   249909 stored elements in Compressed Sparse Row format>

    # Check that the matrix is symmetric. The difference should have no non-zero elements.
    assert (m - m.T).nnz == 0

    statistic, pval = stats.kstest(m.data, 'norm')
    # The null hypothesis is that m.data was drawn from a normal distribution.
    # A small p-value (say, below 0.05) would indicate reason to reject the null hypothesis.
    # Since `pval` below is > 0.05, kstest gives no reason to reject the hypothesis
    # that m.data is normally distributed.
    print(statistic, pval)
    # 0.0015998040114 0.544538788914

    fig, ax = plt.subplots(nrows=2)
    ax[0].hist(m.data, density=True, bins=50)
    stats.probplot(m.data, dist='norm', plot=ax[1])
    plt.show()
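If you would rather not rely on the global NumPy seed, the same construction can be wrapped so that it carries its own random state. This is only a sketch: the name sprandsym_seeded and its seed parameter are made up for illustration, and it assumes a SciPy version whose sparse.random accepts a random_state argument. It also checks that the same seed gives the same matrix and that the fraction of stored entries lands near the requested density.

```python
import numpy as np
import scipy.sparse as sparse


def sprandsym_seeded(n, density, seed=None):
    """Hypothetical variant of sprandsym that owns its random state."""
    rng = np.random.RandomState(seed)
    # Both the sparsity pattern and the values come from the same rng.
    x = sparse.random(n, n, density=density, random_state=rng,
                      data_rvs=rng.standard_normal)
    upper_x = sparse.triu(x)
    # Mirror the upper triangle, then remove the doubled diagonal.
    return (upper_x + upper_x.T - sparse.diags(x.diagonal())).tocsr()


a = sprandsym_seeded(200, 0.05, seed=42)
b = sprandsym_seeded(200, 0.05, seed=42)

# Same seed, same matrix.
assert (a != b).nnz == 0
# The result is symmetric.
assert (a - a.T).nnz == 0
# Fraction of stored entries; it should be close to the requested density.
print(a.nnz / float(200 * 200))
```

The density is only approximately preserved, because the number of entries that sparse.random happens to place in the upper triangle varies from draw to draw.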



PS. I used

    upper_x = sparse.triu(x)
    result = upper_x + upper_x.T - sparse.diags(x.diagonal())

instead of

    result = (x + x.T)/2.0

because I could not convince myself that the non-zero elements in (x + x.T)/2.0 would have the right distribution. First, if x were dense and normally distributed with mean 0 and variance 1, i.e. N(0, 1), then the off-diagonal elements of (x + x.T)/2.0 would be N(0, 1/2). We could fix this by using

    result = (x + x.T)/sqrt(2.0)

instead. Then the result would be N(0, 1). But there is yet another problem: if x is sparse, then at most non-zero locations x + x.T is a normally distributed random variable plus zero. Dividing by sqrt(2.0) then squashes the normal distribution closer to 0, giving a more tightly spiked distribution. As x becomes sparser, the result may look less and less like a normal distribution.

Since I didn't know what distribution (x + x.T)/sqrt(2.0) generates, I opted for copying the upper triangular half of x (thus repeating values that I know to be normally distributed).
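The concern about the sparse case can be checked numerically. The sketch below is not part of the original answer; the seed, the matrix size, and the variable names are arbitrary choices. It builds both symmetrizations from the same x and compares the spread of their non-zero values. Where only one of x[i, j] and x[j, i] is non-zero, the summed value is a single N(0, 1) draw divided by sqrt(2.0), so one would expect a standard deviation near 0.71 rather than 1.

```python
import numpy as np
import scipy.sparse as sparse
import scipy.stats as stats

rng = np.random.RandomState(0)  # arbitrary seed, used only for repeatability

n, density = 2000, 0.01
x = sparse.random(n, n, density=density, random_state=rng,
                  data_rvs=rng.standard_normal)

# Variant 1: add the transpose and rescale by sqrt(2).
summed = ((x + x.T) / np.sqrt(2.0)).tocsr()
summed.eliminate_zeros()

# Variant 2: mirror the upper triangle and fix the doubled diagonal.
upper = sparse.triu(x)
mirrored = (upper + upper.T - sparse.diags(x.diagonal())).tocsr()
mirrored.eliminate_zeros()

summed_std = summed.data.std()
mirrored_std = mirrored.data.std()

# The mirrored matrix stores each off-diagonal value twice, so the KS test
# is run on the upper-triangle values only, which are independent draws.
_, summed_pval = stats.kstest(summed.data, 'norm')
_, upper_pval = stats.kstest(upper.data, 'norm')

print('summed:   std = %.3f, KS p-value = %.3g' % (summed_std, summed_pval))
print('mirrored: std = %.3f, KS p-value (upper half) = %.3g'
      % (mirrored_std, upper_pval))
```

The exact numbers depend on the seed and on the density, so treat the output as an illustration rather than a proof.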

