Thursday, December 15, 2011

ideas

While working intensively on my final project for financial engineering, I started to wonder about new questions as ideas for my future research and study:

- FreeMind software reminds me that when data gets large and people are at a loss, visualization can be extremely valuable. It's good to be able to brainstorm ideas into a visual structure. It helps to organize ideas, especially when working on a long paper. So the questions are:
- how to encode and represent knowledge or rules in a knowledge base or rule-based system? The format may be text, drawing, video... How to set up a mapping or a user-customized, meaningful correlation between these representations?

- how to visualize and solve abstract concepts visually?
- how to visually encode knowledge?

I don't have the answers to these questions now...because I'm busy working on this final project paper and have not thought those questions through yet! But I think sometimes it is important to come up with and ask the right questions first...to drive the direction toward an answer rather than looking blindly for some random answer...again...another question and idea about the approach to problem solving...
- is it active questioning/asking driven? or is it "search"/answer driven? how to search if you don't know what to search for?

Thursday, October 13, 2011

Install new VM in Citrix XenCenter

Here are a few ways to install/instantiate a new VM in Citrix XenCenter. Some I learned the hard way.

option1: install from an online repository source
ex: install CentOS. You can find some online repositories. Otherwise, you can follow the instructions in the XenCenter User Guide to set up your own Red Hat repositories.

option2: install from DVD - ISO files
this is the common method we use to set up a regular machine. Somehow, in my first few attempts, I had problems figuring out how to attach a DVD or select install from DVD when creating a new VM. The KEY is: if you select any existing template in XenCenter when creating a new VM, you CANNOT select the option to install from an ISO library or DVD. Also, it sounds silly, but don't bother attaching a DVD drive to the VM, because it already has one.

the solution is to set the TEMPLATE for the new VM to "Other install media" in the template drop-down list. Then the options to install from DVD or ISO library will be enabled. Alternatively, you can leave the Installation Media section empty; then, once the VM is created and started for the first time, select the desired DVD or ISO library in the dropdown on the "Console" tab.

note:
- when creating a new ISO library such as a CIFS/SMB Windows share, specify the path as //server/share with a valid login account
- alternatively, a trick to install from a URL: run Python's built-in web server to serve the ISO files from a local directory with the command:
> python -m SimpleHTTPServer
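
A side note for newer setups: the SimpleHTTPServer module is Python 2 only. On Python 3 the same one-liner is "python3 -m http.server". Below is a minimal sketch of the same trick as a script, assuming Python 3.7 or later. The function name make_iso_server, the port, and the /srv/isos path are my own placeholders, not anything XenCenter requires.

```python
import functools
import http.server


def make_iso_server(directory, port=8000):
    """Return an HTTP server that serves the files under `directory`."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # An empty host binds all interfaces so other machines can reach the files.
    return http.server.ThreadingHTTPServer(("", port), handler)


# Usage (blocks until interrupted with Ctrl+C); the path is hypothetical:
#   make_iso_server("/srv/isos").serve_forever()
```

The install URL would then look like http://your-machine:8000/name-of-iso, where both parts depend on your own setup.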

Life is good again;)

QP

Thursday, September 22, 2011

evaluating tools for statistical data mining and analysis of financial applications

Many solutions exist, including Matlab, C/C++, Java, and Python.

I choose Python for my projects because:
- scripting language = fast development
- easy to migrate from Java OOP, with integration through Jython (formerly JPython) or with C
- hooks up well with R for statistical analysis
- beautiful graphing
- free + open source
- great tutorials, documentation

Wednesday, August 17, 2011

restoring dell laptop running windows vista

My girlfriend's Dell XPS M1330 laptop running Vista has most likely been infected with a virus and is malfunctioning. A few important steps to note for myself that may not be obvious:

- back up important data by manually copying the User folders to an external hard drive using an Ubuntu Linux CD
- perform factory restore *** by pressing F8 on boot up => system repair -> log in and choose factory recovery... This step is not obvious, and there is no menu option in a regular Windows login to do this.

http://support.dell.com/support/topics/global.aspx/support/kcs/document?c=us&cs=RC956904&l=en&s=hied&docid=DSN_336966&isLegacy=true

Friday, August 12, 2011

A week on "network path not found" error

This whole week, I have been reading madly to set up Active Directory and Domain Name Service on Windows Server 2008 and Windows 7 clients. Learned that DNS is the KEY to understanding and configuring AD-DS. But what's more? Before even configuring DNS, setting up computers to connect and share/access resources can be tricky.

The error: network path not found
when I try to join a client computer to a domain. Thought it must be some AD-DS error. But hang on, let me try simple file sharing...and yes...I got the same error. The frustration is that I can ping the server fine, and the DNS test with nslookup works fine too. Even when all firewalls on the server are turned off.

The solution:
Network adapter settings must be set properly:
clients: - check Client for Microsoft Networks
server: - check File and Printer Sharing for Microsoft Networks <= ****** key solution; without this, no effort works - and I figured it out today, Friday, after a long week of struggling with nothing done but madly reading.
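
In hindsight, ping (ICMP) and nslookup (DNS) never touch the file sharing service itself, so both can pass while sharing is dead. A quicker check is to test the SMB port directly. Here is a rough sketch, assuming the server is expected to accept SMB over TCP port 445. The helper names tcp_port_open and diagnose are made up for this note.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def diagnose(host):
    """Check name resolution and the SMB port (445) for `host`."""
    try:
        address = socket.gethostbyname(host)
    except socket.gaierror:
        return "name does not resolve"
    if not tcp_port_open(address, 445):
        return "resolves to %s, but TCP 445 (SMB) is not reachable" % address
    return "resolves to %s and TCP 445 (SMB) is reachable" % address
```

If the name resolves but port 445 is not reachable, the problem is probably in the sharing service or the adapter bindings rather than in DNS or AD-DS.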




Tuesday, June 21, 2011

configure multiple virtual hosts on apache server on fedora

A few notes to myself on the necessary steps to configure virtual hosts on an Apache server running Fedora:

- first, get the DNS mapping correct. Ping the machines/hosts to verify that DNS is working correctly.
- configuration file = /etc/httpd/conf/httpd.conf
- check that this is the correct configuration file that controls the Apache server by redirecting the default web page to a test page and making sure it is displayed correctly. It may be necessary to use different browsers on different computers to access the web server and verify that it is working properly. Sometimes, using the same browser on one machine (Firefox, for example) seems to cache data and creates the illusion that the server is always available, even after I shut down the web server.

- modify the configuration file to uncomment and enable the NameVirtualHost line

- virtual hosting means using 1 instance of the Apache server to serve multiple web domains/hosts.

- sample configuration for one virtual host:


<VirtualHost *:80>
    ServerName www.test.com
    DocumentRoot /var/www/
</VirtualHost>


Once at least one virtual host is configured and used, every host is treated as a virtual host and must have its own configuration block like the segment above.

- finally restart the web server as below:
#> /sbin/service httpd restart
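
To get around the browser caching problem mentioned above, the virtual host can also be checked from a script. Name-based virtual hosting picks the site from the Host header, so a rough sketch like the one below can ask the server for a given host name even before DNS is fully in place. The function name fetch_vhost and the address in the usage comment are placeholders of mine.

```python
import http.client


def fetch_vhost(server_address, hostname, port=80, path="/"):
    """GET `path` from `server_address`, asking for virtual host `hostname`."""
    conn = http.client.HTTPConnection(server_address, port, timeout=5)
    try:
        # An explicit Host header replaces the default one (the connection
        # address), so the server picks the matching virtual host.
        conn.request("GET", path, headers={"Host": hostname})
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


# Usage (hypothetical server address):
#   status, body = fetch_vhost("192.168.1.10", "www.test.com")
```

Each request opens a fresh connection with no cache, so if the server is down the call fails right away instead of showing a stale page.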

Done ;)

Friday, May 20, 2011

more on R&D

possible master's thesis topics:

- comparative study of contemporary programming languages
- high performance programming languages
- how to integrate modern programming languages
- concurrent mutation testing

- research on programming language design, motivated by the need to generalize knowledge in web development

some of my current thoughts regarding R&D directions

Having taken courses in Data Mining, Software Testing, and Parallel Computing, I have a strong foundation to move on with my research directions. Most of what I know now still feels theoretical, and it demands a great deal of application before I can solve any real-world problems or write a decent thesis.

pending thoughts on what to conquer next:
- concurrency programming and testing
- image processing: more application of data mining in the image domain, maybe some intro to computer graphics and computer vision
- financial engineering: simulation modeling, AI agent programming, data mining
will need to find an answer and pick one by the end of summer

some R&D directions:
thesis related
- more testing of concurrent/parallel/non-deterministic systems => concurrent programming seems to be the best fit as of now. But I am thinking about getting more coverage of MS computer science with an image processing component and a more practical domain application of data mining. On the other hand, an image system is static, while a financial system is dynamic and non-deterministic, and it offers opportunities for data mining as well....testing it may be a bit hard..and I can still take computer graphics next spring

work related
- computer security - a must
- research on web application development with respect to data-centric content management systems

IBM developerWorks has some great articles today that interest me:

use python to parse unix memory dump
http://www.ibm.com/developerworks/linux/library/l-parse-memory-dumps/

basic use of Pthread
http://www.ibm.com/developerworks/linux/library/l-pthred/index.html
=> reminds me to apply my parallel computing knowledge and learn more about cache-friendly algorithms

distributed concurrent programming with Erlang(!)
http://www.ibm.com/developerworks/opensource/library/os-erlang2/
=> reminds me of all the new interesting languages I was exposed to this semester (python, D, lua....), to continue my study of programming languages, and a vote for taking concurrent programming, given the demand for multi-core programming and its combination of parallel computing and testing.

my upcoming project: build open source content management system
http://www.ibm.com/developerworks/opensource/tutorials/os-cms2/

automate test case with rational test manager
http://www.ibm.com/developerworks/rational/library/automate-test-cases/index.html

how to use the Wikipedia API to get its info - great potential for data mining and content management systems
http://www.ibm.com/developerworks/library/x-phpwikipedia/

IBM developerWorks is cool, with whole sections for Linux, Java, and web development
http://www.ibm.com/developerworks/web/
=> browsing these IBM technical articles inspires me and reminds me to make my own collection of technical documents on what I have learned and tried

interesting tutorial on learning HOW to think/program functionally
http://www.ibm.com/developerworks/java/library/j-ft1/index.html

some tutorial on personal finance on lifehacker
http://lifehacker.com/5803422/the-graduating-students-guide-to-managing-finances-and-tackling-debt

...
I start to think....now that I am well grounded...should I focus on my research topic?!


interesting tutorial: JavaScript for Java developers
http://www.ibm.com/developerworks/java/library/j-javadev2-18/index.html

even more research direction: multithreaded data structures for parallel computing
http://www.ibm.com/developerworks/aix/library/au-multithreaded_structures1/
=> reminds me that thinking I can master data mining or high performance computing simply by taking a graduate course is definitely not enough...APPLY >>>>APPLY>>>>APPLY

IBM hadoop
http://www.alphaworks.ibm.com/tech/idah


****** IBM Computer Science Research projects
http://domino.research.ibm.com/comm/research.nsf/pages/d.compsci.html

IBM programming language and software engineering research
https://researcher.ibm.com/researcher/view_pic.php?id=3
https://researcher.ibm.com/researcher/view_grouppubs.php?grp=3

=> reminds me to make my own website with research projects, with a FOCUS on real, practical, hands-on projects that I have done regarding programming languages..+.testing..+.high performance computing = testing high performance languages?! Another vote for concurrent programming and testing. Should try to get something as decent as the data mining term project going...active research.

top free open source web development tool
http://www.andrewsellick.com/34/top-15-free-and-open-source-web-developer-tools-updated

academic research in programming language design and compiler
http://www.cs.cmu.edu/~mleone/language-people.html

software testing by statistical method
http://www.itl.nist.gov/div897/ctg/stsm.htm

workshop on programming language for HPC
www.cs.uiuc.edu/~wgropp/bib/reports/HPCWPL.pdf

REMIND MYSELF: GET INVOLVED in an open source project ASAP, especially a project on a parallel programming language such as Scala or Fortress for Java, or OpenJDK
http://en.wikipedia.org/wiki/Fortress_%28programming_language%29
http://openjdk.org/

Focus on my major concentration. Financial engineering can be side reading...open videos are available anyway...why pay? => one more vote for taking concurrent programming, focusing on improving my weak point, then taking CG later

IBM University Relations: resources for students, also entrepreneurs
http://www.ibm.com/developerworks/university/

system security guidelines from the NSA
http://lifehacker.com/5803569/lock-down-your-computer-like-the-nsa

Friday, May 6, 2011

graphic assembly programming

http://www.gamedev.net/page/resources/_/reference/programming/140/283/graphics-programming-black-book-r1698

http://people.cs.vt.edu/~feng/index.php

http://developer.nvidia.com/cuda-spotlight-compute-cure

Thursday, May 5, 2011

Open source video lecture

Following the example of MIT, IIT India also offers open video lectures:
http://nptel.iitm.ac.in/courses.php?branch=Comp
http://onionesquereality.wordpress.com/2008/04/30/more-video-lectures-iit-open-course-ware/

Wednesday, May 4, 2011

mail server on linux

To figure out which mail server (MTA) is running on Linux:

alternatives --display mta

alternatives --config mta

sending email:
> sendmail emailaddress
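
For scripted mail, the same sendmail binary can be driven from Python. This is only a sketch, assuming sendmail (or a compatible MTA wrapper) lives at /usr/sbin/sendmail. The function names and the addresses in the usage comment are placeholders of mine.

```python
import subprocess
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Build a plain-text email message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_via_sendmail(msg, sendmail_path="/usr/sbin/sendmail"):
    """Pipe the message to the local MTA (path is an assumption)."""
    # -t: take the recipients from the message headers
    # -oi: do not treat a line with a single "." as the end of input
    subprocess.run([sendmail_path, "-t", "-oi"], input=msg.as_bytes(), check=True)


# Usage (hypothetical addresses):
#   send_via_sendmail(build_message("me@example.com", "you@example.com", "test", "hello"))
```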

more sendmail tutorial
http://rimuhosting.com/support/settingupemail.jsp?mta=sendmail

Sunday, May 1, 2011

open source software alternative

http://www.lostintechnology.com/windows/20-best-free-or-freemium-alternatives-to-popular-paid-software-programs

http://www.lostintechnology.com/windows/5-great-examples-of-cross-platform-open-source-software

Friday, April 8, 2011

sleeping on data mining research

This spring semester, I am working on a data mining research project relating to software testing and parallel computing. Some of my current research questions:

- how can data mining help software testing?
- what are the challenges of software testing? As of now, I know that the input space for testing (the test space) is infinite, so testing is complicated and expensive. This is where I think data mining techniques can help, by reducing this infinite search space to a small, cost-effective subset of critical tests.
- what is the minimal test set (test requirement)? I think a lot of research has already come up with quite a few metrics for test requirements and coverage... the next question is how to evaluate the goodness of one test set against others = how to evaluate the goodness of one cluster of tests against another cluster? By what metrics?

- as of now, I envision:
testing challenge = infinite state input space, infinite and costly tests
desired outcome = a group of minimalist, cost-effective, critical tests
tool = data mining clustering algorithm
project scope: clustering analysis of test space
s1 = imperative programs
s2 = logic programming/finite state machines
s3 = parallel and concurrent programs
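
To make the clustering idea concrete, here is a toy sketch. I have not picked a clustering algorithm yet, so this uses the simple "leader" algorithm as a stand-in: each test is a vector of 0/1 coverage flags, tests within a small Hamming distance of a cluster's leader join that cluster, and the leaders form the reduced test set. The coverage data and the radius are made up for illustration.

```python
def hamming(a, b):
    """Number of positions where two equal-length coverage vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def leader_clusters(tests, radius):
    """Group tests by coverage similarity using the leader algorithm.

    `tests` maps a test name to a tuple of 0/1 flags (1 = statement covered).
    Returns a dict mapping each leader's name to the names in its cluster.
    """
    clusters = {}
    for name, vector in tests.items():
        for leader in clusters:
            if hamming(tests[leader], vector) <= radius:
                clusters[leader].append(name)
                break
        else:
            # No existing leader is close enough: this test starts a new cluster.
            clusters[name] = [name]
    return clusters


# Hypothetical coverage data: 5 tests over 6 statements.
tests = {
    "t1": (1, 1, 1, 0, 0, 0),
    "t2": (1, 1, 0, 0, 0, 0),
    "t3": (0, 0, 0, 1, 1, 1),
    "t4": (0, 0, 0, 1, 1, 0),
    "t5": (1, 1, 1, 0, 0, 0),
}
clusters = leader_clusters(tests, radius=1)
reduced_suite = sorted(clusters)  # one representative test per cluster
```

With this data the five tests collapse into two clusters, led by t1 and t3. How to judge whether that reduced set is "good" is exactly the open metrics question above.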

Most of the reference library books I have read so far talk about logic, concurrent programming, and testing. The question that is bothering me: what is the relation between logic, testing, and concurrency? Why do I have to deal with logic?

Friday, March 18, 2011

A journey into VirtualBox and Cluster setup

tutorial on how to share data between host and guest
http://blogs.sun.com/tao/entry/virtual_box_shared_folder_between
http://www.dedoimedo.com/computers/virtualbox-guest-addons.html

other vm setup
http://www.virtualbox.org/manual/ch05.html#vdidetails
http://forums.virtualbox.org/viewtopic.php?t=674
http://www.virtualbox.org/manual/ch05.html#cloningvdis

helpful forum
http://forums.virtualbox.org/viewtopic.php?t=674