Importing and Compatibility.
One of the most useful features, from a developer's point of view, is that it is database friendly: it can import data from many database engines, including MySQL, MS Access, SQLite, PostgreSQL and MS SQL Server, and support for Oracle is planned. Importing is not limited to database engines; it can also import from office documents, either MS Excel or OpenOffice Calc.
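As a rough illustration of the kind of SQLite source mentioned above, the following sketch builds a tiny database with Python's standard sqlite3 module. The table name `survey`, its columns and the rows are all made up for the example; nothing here is SOFA's own code or a format it requires.

```python
import sqlite3


def build_demo_db(path):
    """Create a small SQLite database holding one fictitious table.

    Returns the number of rows stored, so the caller can confirm the write.
    """
    # Fictitious (age, country) rows, for illustration only.
    rows = [
        (25, "Japan"),
        (41, "Italy"),
        (33, "Germany"),
    ]
    conn = sqlite3.connect(path)
    try:
        conn.execute("DROP TABLE IF EXISTS survey")
        conn.execute(
            "CREATE TABLE survey ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "age INTEGER, country TEXT)"
        )
        conn.executemany(
            "INSERT INTO survey (age, country) VALUES (?, ?)", rows
        )
        conn.commit()
        return conn.execute("SELECT COUNT(*) FROM survey").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical file name; pick any location you like.
    print(build_demo_db("demo_survey.db"))
```

The autonumbered `id` column is the sort of field that typically ends up read-only in a data entry tool.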
Output: SOFA can produce output as a simple HTML file, so it can be used on a website, or as a spreadsheet file. It produces colourful output with many types of graph, such as bar charts, pie charts, and single or multiple line charts.
Available Tests :
* Row and column percentages, with the ability to nest variables, e.g. look at Ethnicity and Gender vs Age
* Standard Deviation
* N items
* Pearson’s Chi-Square with Contingency Tables
* Independent samples t-test
* Paired samples t-test
* One-way ANOVA
* Mann Whitney U
* Wilcoxon Signed Ranks
* Kruskal Wallis H
* Pearson’s Correlation
* Spearman’s Correlation
Linux: a Debian/Ubuntu package is listed, and for non-Debian-based distros the Linux package is available as a *.tar.gz archive. Windows and Mac OS X binaries are also available in the download section.
- Sourceforge SOFA Statistics Page : http://sourceforge.net/projects/sofastatistics/
- Google Group : http://groups.google.com/group/sofastatistics?pli=1
- Launchpad : https://launchpad.net/sofastatistics/
Deb packages are supplied for download on the main SOFA website. To cater to other flavours of Linux, a tar.gz is also provided. Inside, you will find README.txt and INSTALL.sh.
- Step 1 is to use your distro package manager to install all the required support packages e.g. matplotlib (for chart plotting). Details of required packages are in the next subsection.
- Step 2 is to run INSTALL.sh as described in README.txt.
And if you manage to get SOFA working on other distros please email me (email@example.com) the relevant package details etc and a screen-shot (preferably one which reveals the distro involved).
- python (>= 2.6.2),
- wx-common (>= 220.127.116.11),
- python-wxversion (>= 18.104.22.168),
- python-wxgtk2.8 (>= 22.214.171.124),
- python-numpy (>= 1:1.2.1),
- python-pysqlite2 (>= 1.0.1),
- python-mysqldb (>= 1.2.2),
- python-pygresql (>= 1:4.0),
- Python was already there
- wxPython-2.8.11… and that brought with it some other packages needed.
- for more recent versions of Fedora you will need to separately install python-matplotlib-wx (otherwise you get a message about “No module named backend_wxagg”)
- python-wxGTK 126.96.36.199…
- python-numpy (NB to upgrade the existing version 1.3… to the later education repo version 1.5… - see Python matplotlib on openSUSE)
- python-mysql 1.2.2-90.1
- PyGreSQL 3.8.1…
- python-matplotlib 1.0.0…
- python-sqlite2 2.6.0…
- python-webkit (upgraded)
- python-webkitgtk 1.1.8… (to avoid error about backend_wxagg module being missing)
- Name: SOFA Statistics
- Description: Analysis package
- Command: python /usr/local/share/sofa/start.py
- Icon: /usr/local/share/sofa/images/sofa_48x48.xpm
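On desktops that follow the freedesktop.org Desktop Entry convention, the four launcher fields above map naturally onto a `.desktop` file. The sketch below only builds the file text; that your desktop reads `.desktop` files, and the choice of `Terminal=false`, are assumptions rather than anything stated in the SOFA instructions.

```python
def desktop_entry(name, comment, command, icon):
    """Build the text of a freedesktop.org Desktop Entry for a launcher."""
    lines = [
        "[Desktop Entry]",
        "Type=Application",
        "Name=" + name,
        "Comment=" + comment,
        "Exec=" + command,
        "Icon=" + icon,
        # Assumption: SOFA is a GUI program, so no terminal window is needed.
        "Terminal=false",
    ]
    return "\n".join(lines) + "\n"


# Values taken from the launcher details listed above.
entry = desktop_entry(
    "SOFA Statistics",
    "Analysis package",
    "python /usr/local/share/sofa/start.py",
    "/usr/local/share/sofa/images/sofa_48x48.xpm",
)
print(entry)
```

Saving the printed text under a name such as `sofastats.desktop` (a hypothetical file name) in your desktop's applications folder would be the usual next step.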
If you are able to get SOFA to launch at all, but there is a problem of some sort, look at the output.txt file in your /home/username/sofa/_internal folder. It may be, for example, that you forgot to install matplotlib.
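A quick way to inspect that file is to print its last few lines. This is a minimal sketch written for a current Python 3 interpreter and independent of SOFA itself; the helper name `last_lines` and the default of 20 lines are choices made for the example.

```python
import os


def last_lines(path, count=20):
    """Return the last `count` lines of a text file, or [] if it is missing."""
    if not os.path.isfile(path):
        return []
    # errors="replace" keeps going even if the log contains odd bytes.
    with open(path, "r", errors="replace") as handle:
        return handle.read().splitlines()[-count:]


# Path from the text above, with ~ standing in for /home/username.
log_path = os.path.expanduser("~/sofa/_internal/output.txt")
for line in last_lines(log_path):
    print(line)
```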
    #!/bin/bash
    python /usr/local/share/sofa/start.py

Save these two lines as a shell script, e.g. in your home folder. If bash is not located at /bin/bash on your system, use the command

    which bash

to find it.
Then make a symlink to it located in /usr/local/bin (NB give everyone rights to run it)
    su root
    ln -s /home/username/runsofastats.sh /usr/local/bin/sofastats
    chmod a+x /usr/local/bin/sofastats

Now you can run SOFA Statistics from the command line by typing in

    sofastats

See Linux by example - how to create symlink?
    /usr/local/share/sofa
    /usr/local/share/sofa/boomslang
    /usr/local/share/sofa/css
    /usr/local/share/sofa/dbe_plugins
    /usr/local/share/sofa/googleapi
    /usr/local/share/sofa/googleapi/atom
    /usr/local/share/sofa/googleapi/gdata
    /usr/local/share/sofa/googleapi/gdata/docs
    /usr/local/share/sofa/googleapi/gdata/oauth
    /usr/local/share/sofa/googleapi/gdata/spreadsheet
    /usr/local/share/sofa/googleapi/gdata/tlslite
    /usr/local/share/sofa/googleapi/gdata/tlslite/integration
    /usr/local/share/sofa/googleapi/gdata/tlslite/utils
    /usr/local/share/sofa/images
    /usr/local/share/sofa/_internal
    /usr/local/share/sofa/locale
    /usr/local/share/sofa/locale/gl_ES
    /usr/local/share/sofa/locale/gl_ES/LC_MESSAGES
    /usr/local/share/sofa/projs
    /usr/local/share/sofa/reports
    /usr/local/share/sofa/reports/sofa_report_extras
    /usr/local/share/sofa/scripts
    /usr/local/share/sofa/vdts/

In the following example, I downloaded the sofa source code into the Downloads folder in Fedora 14.
Then extract the contents of sofa_0.9.21.orig.tar.gz into the Downloads folder.
The next lot of commands were performed as root (NB the /* after sofa.main)
    cd Downloads/sofa/sofa_0.9.21.orig
    cp -r sofa /usr/local/share
    cp -r sofa.main/* /usr/local/share/sofa
    cp runsofastats.sh /usr/local/share/sofa

In versions prior to 0.9.22, the file permissions were all incorrect. Here is how to fix them:
    su root
    chmod -R u=rwx /usr/local/share/sofa
    chmod -R go=rx /usr/local/share/sofa

NB nothing will work without the dependencies installed. Running:

    python /usr/local/share/sofa/start.py

will return a traceback because wxversion or whatever isn't available. So the next step is installing the dependencies.
After installing wxPython, but before adding the other dependencies, running sofa prematurely will result in a message about a problem with the first round of local importing.
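Before launching, it can save time to check which dependencies the interpreter can actually import. The sketch below is not part of SOFA; the import names in `CANDIDATES` are assumptions inferred from the package lists above (for example, `pgdb` for PyGreSQL and `MySQLdb` for python-mysqldb) and may differ on your distro.

```python
def missing_modules(module_names):
    """Return the names from module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing


# Assumed import names for the packages listed earlier; adjust as needed.
CANDIDATES = [
    "wxversion", "wx", "numpy", "matplotlib",
    "pysqlite2", "MySQLdb", "pgdb",
]

if __name__ == "__main__":
    for name in missing_modules(CANDIDATES):
        print("Missing: " + name)
```

Run it with the same `python` that starts SOFA, since a module installed for one interpreter is not visible to another.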
This brings up the data selection dialog. Here you can look at existing data tables or make new ones. Here we just want to look at the demonstration data table “demo_tbl”. Click on “Open”.
Here you can see the data we will be test analysing using SOFA Statistics. Note the pale blue column - the background colour indicates the field is read-only. Typically, read-only fields are autonumbered or timestamps.
Click on “Close” when you're finished looking.
Let's start with a simple report table of Age Group vs Country. NB all of this data is fictitious and designed to allow features of the program to be demonstrated.
- For “Table Type” select “Crosstabs”. A cross tabulation shows one or more variables against one or more other variables e.g. Age Group in the rows and Country in the columns.
- We need to add a row so click on “Add” under the “Rows” label
- Select “Age Group” and either double click it or select “OK”.
In the demonstration pane below you will see a rough illustration of what the table will look like. If you want to see the actual table, click on “Run”.
If “Add to report” is ticked, the output will also be saved to the end of the output file specified at the bottom of the form.
- Click on “Config” under the “Columns” label
- Tick “Total” under the “Misc” heading
- Tick “Column %” and “Row %” under the “measures” heading
- Click on “OK” to see changes in demonstration table. NB to see actual results, click on “Run”.
If you click “Run” with “Add to report” ticked, you can view the result by clicking on the “View” button. This will open your default web browser so you can see the output.
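To make the "Row %" and "Column %" measures concrete, here is a small pure-Python sketch of a cross tabulation. It assumes row % means a cell's share of its row total and column % its share of its column total; the records are fictitious and the code is an illustration, not SOFA's implementation.

```python
from collections import Counter


def crosstab(records):
    """Count (row, column) pairs and derive row and column percentages."""
    counts = Counter(records)
    row_totals = Counter()
    col_totals = Counter()
    for (row, col), n in counts.items():
        row_totals[row] += n
        col_totals[col] += n
    table = {}
    for (row, col), n in counts.items():
        table[(row, col)] = {
            "freq": n,
            "row_pct": 100.0 * n / row_totals[row],
            "col_pct": 100.0 * n / col_totals[col],
        }
    return table


# Fictitious (age group, country) records.
records = [
    ("< 20", "Japan"), ("< 20", "Italy"),
    ("20-29", "Japan"), ("20-29", "Japan"),
]
table = crosstab(records)
for cell, measures in sorted(table.items()):
    print(cell, measures)
```

Here both "20-29" respondents are from Japan, so that cell has a row % of 100, while its column % is two thirds because Japan has three respondents in total.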
The styling of your table can also be changed - here are some examples of different report tables:
Documentation on making report tables is extended in Making Report Tables
Then click on the “CONFIGURE TEST” button (ANOVA should already be selected).
Let's look at whether there is a difference between the average ages in the 3 different countries. NB all the data here is fictitious and only for example purposes.
- Select the variable that is averaged (the one we think might vary between groups). In this case, select “Age”.
- Select the variable with the groups. In this case, select “Country” and then select “Group A” and “Group B”.
- Click on “Run” to see results.
In this case, there is probably a real difference (p has a very small value). Looking at the mean age for each group and the distribution for each group will help us decide how important the difference is for the purpose at hand. NB a difference can be statistically significant and clinically/politically/practically etc. insignificant.
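For readers curious about what sits behind the result, the sketch below computes the one-way ANOVA F statistic by hand using the textbook formula (between-group mean square divided by within-group mean square). The ages and group labels are invented, and this is not SOFA's own code; turning F into a p value needs the F distribution, which is left out here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / float(len(all_values))
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        mean = sum(g) / float(len(g))
        # Variation of the group mean around the overall mean.
        ss_between += len(g) * (mean - grand_mean) ** 2
        # Variation of individual values around their own group mean.
        ss_within += sum((x - mean) ** 2 for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Fictitious ages for three hypothetical groups.
ages = {
    "Country A": [20, 22, 24],
    "Country B": [30, 32, 34],
    "Country C": [40, 42, 44],
}
f_stat, df_b, df_w = one_way_anova_f(list(ages.values()))
print(f_stat, df_b, df_w)
```

A large F, as in this contrived data where the group means are far apart relative to the spread within each group, is what leads to a very small p value.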