DBMS Architecture Research

If you cooperate with non-DA members who are (temporarily) granted access to our SciLens cluster (nobody outside DA has access to the cluster without cooperating with us), please make sure to (1) inform them about our usage (reporting/claiming) policies described on this page and (2) introduce them and their work to the group.

If you plan to use machines in our SciLens cluster, please report your usage (plans) and claim machines the usual way via the webpages below. Please do not forget to release them again once you are done with them! Do not hesitate to ask if you have any questions.

SciLens backup

The data on the SciLens machines is not backed up by ITF. To ensure continued use of your data, you either have to find a backup storage place (within the cluster, on your desktop, or on external storage) or be able to regenerate your settings. In case of doubt, or for advice on where and how to make your backup, contact Arjen.
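
As a minimal sketch of one way to prepare such a backup, the helper below packs a directory into a dated tar archive that can then be copied to the storage place of your choice. The function name make_backup and the example paths are assumptions for illustration, not part of the SciLens setup.

```shell
# Hypothetical helper: pack SRC_DIR into DEST_DIR/<name>-<YYYYMMDD>.tar.gz
# and print the path of the archive that was written.
make_backup() {
    src_dir=$1
    dest_dir=$2
    name=$(basename "$src_dir")
    archive="$dest_dir/$name-$(date +%Y%m%d).tar.gz"
    mkdir -p "$dest_dir" || return 1
    # -C switches to the parent directory so the archive holds relative paths.
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$name" || return 1
    printf '%s\n' "$archive"
}

# Example (both paths are placeholders):
#   make_backup /scratch/USER/settings /path/to/backup/place
```

An archive on the same machine does not survive a re-install, so it still has to be copied elsewhere afterwards.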

SciLens login

Logging into the cluster is regulated through scilens-ssh. To get access, you have to contact Arjen de Rijke.

Thereafter you can ssh to the scilens-ssh portal machine:

ssh -A -i ~USER/.ssh/id_rsa_scilens_USER scilens-ssh.ins.cwi.nl

and subsequently move to the specific machine you desire to use, e.g.

ssh bricks09

You land in the home directory created for you on this machine. It sits on a small disk that is shared with all users and wiped during a re-install, so it should not be used for anything except environment resource scripts and configuration files.

For most practical cases, you should make yourself a directory in /scratch or /data, where there should be ample disk space.

SciLens data transport

The easiest way is to PULL small files using scp onto the selected SciLens machine. For example,

scp vienna.ins.cwi.nl:/ufs/mk/.monetdb .

[Arjen update next] This requires the ssh configuration file on your desktop to be set up so that the scilens-ssh machine works as a forwarding device.

For example, .../.ssh/config looks like:

Host scilens-ssh.ins.cwi.nl
        User <USER>
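
A minimal sketch of a fuller forwarding setup, assuming a desktop OpenSSH client that supports the ProxyJump option (OpenSSH 7.3 or later). The Host pattern bricks*, the reuse of the key file name from the login command, and the helper name write_scilens_ssh_config are assumptions; the configuration actually recommended for SciLens may differ, and whether the cluster machines accept your key this way is not confirmed here.

```shell
# Hypothetical helper: write an ssh client configuration to FILE for user NAME.
# Review the result, then merge it into your own ssh configuration by hand.
write_scilens_ssh_config() {
    file=$1
    name=$2
    cat > "$file" <<EOF
# Portal machine: the only host reachable directly from the desktop.
Host scilens-ssh.ins.cwi.nl
        User $name
        IdentityFile ~/.ssh/id_rsa_scilens_$name
        ForwardAgent yes

# Cluster machines (assumed naming pattern): hop through the portal.
Host bricks*
        User $name
        ProxyJump scilens-ssh.ins.cwi.nl
EOF
}

# Example: write_scilens_ssh_config ./scilens_ssh_config USER
```

ForwardAgent yes mirrors the -A flag of the login command, and ProxyJump lets ssh and scp reach a bricks machine through the portal in one step.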

SciLens local data transport

Copying files within the cluster is straightforward using the scp command, identifying the machines and directory locations involved. For example, while on bricks09 you can copy the file data to bricks10:

scp data bricks10:/scratch/USER

You can also use the rsync command to clone your environment easily on multiple machines. It requires a configuration file with directives for the rsync daemon. Create an rsync daemon configuration file named rsyncd.conf in your /scratch/USER directory containing:

port = 2873
use chroot = no
path = /scratch/USER

You are free to choose any port number above 1024 (lower ports are reserved for root); pick one that other users are not already using, otherwise conflicts might occur. Let's assume again that you are on bricks09 and want a synchronized copy on bricks10. On bricks09, create the above configuration file and start the rsync daemon with the command:

rsync --daemon --config=/scratch/USER/rsyncd.conf

Then log in to bricks10 and, in your /scratch/USER directory, execute the command:

rsync -aH bricks09:/scratch/USER/* /scratch/USER/

For further details see the rsync manual or contact the expert. After the rsync is complete you should terminate the rsync daemon on bricks09.
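
Note that rsync only talks to a daemon when the source is written with a double colon (host::module) or as an rsync:// URL; a source with a single colon is fetched through the remote shell (ssh) instead. A minimal sketch of a daemon setup with a named module follows. The module name scratch, the pid file location, and the helper name write_rsyncd_conf are assumptions for illustration.

```shell
# Hypothetical helper: write an rsync daemon configuration into DIR,
# listening on PORT and exposing DIR as a module named "scratch".
write_rsyncd_conf() {
    dir=$1
    port=$2
    cat > "$dir/rsyncd.conf" <<EOF
port = $port
use chroot = no
pid file = $dir/rsyncd.pid

[scratch]
path = $dir
read only = yes
EOF
}

# On bricks09 (USER and the port number are placeholders):
#   write_rsyncd_conf /scratch/USER 2873
#   rsync --daemon --config=/scratch/USER/rsyncd.conf
# On bricks10, pull through the daemon (note the double colon):
#   rsync -aH --port=2873 bricks09::scratch/ /scratch/USER/
# Back on bricks09, stop the daemon once the transfer is done:
#   kill "$(cat /scratch/USER/rsyncd.pid)"
```

Recording the pid file makes it easy to terminate the right daemon afterwards without searching the process list.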

Returning small files from the SciLens machine to your desktop can be performed using scp. For big files, or for importing large amounts of data into the cluster, contact the expert.

Export data from the cluster

For small files on any of the machines, first copy them to the scilens-ssh machine, after which you can pick them up from your desktop using scp.

For large exports we need to design a better solution.

Accessing the internet

You can use wget to access any web-source from within the cluster.

You have to contact the expert if you want to run a web client/server setup on the machine, because this requires an ssh tunnel setup.

Root-specific features

Some tools have been added to make your life easier with respect to performance analysis. They have to be called with the sudo command, for example:

sudo iotop ...

Pre-installed user-requested packages

External software that requires root permissions can only be installed from source in your local environment. For example, postgresql, mysql and friends.

Non-installed libraries available within the Fedora distributions can be enabled by contacting the expert.

Specific application frameworks, e.g. java, can be installed from the Fedora repository, but due to versioning issues we advise using a local copy as much as possible. When in doubt, follow the expert route.

Work with multiple machines

With the command 'clush', one can execute the same command on multiple SciLens machines simultaneously. Assume you are logged in on scilens-ssh, then run:

clush -w bricks[01-16]

This will give you a prompt. Any command you type in here, e.g., 'df -h /scratch', will be run on the machines bricks01, ..., bricks16. Try it out to see the results produced by the selected bricks machines.

The '-w' option allows you to pass a list of machines you want to work on.

In theory, you can address all rocks machines with one clush command: clush -w rocks[001-144]. In practice, however, it is more advisable to work with smaller groups of machines, say 20. This also makes it easier to cancel a command if one of the machines freezes.

With 'quit' you can stop clush. For more information, please see the man page of clush.
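
The advice to work with smaller groups can be sketched as a loop over batches of node ranges. The helper name rocks_batches is an assumption; the node naming rocks[001-144] and the group size of 20 come from the text above. The clush call is shown as a comment because it only works from inside the cluster; giving a command after the node set runs it once without opening a prompt.

```shell
# Print node ranges covering rocks001..rocks<TOTAL> in groups of SIZE,
# one range per line, e.g. rocks[001-020].
rocks_batches() {
    total=${1:-144}
    size=${2:-20}
    start=1
    while [ "$start" -le "$total" ]; do
        end=$((start + size - 1))
        if [ "$end" -gt "$total" ]; then
            end=$total
        fi
        printf 'rocks[%03d-%03d]\n' "$start" "$end"
        start=$((end + 1))
    done
}

# Inside the cluster, run one command per batch (-b groups identical output).
# stdin is redirected so clush does not swallow the remaining batch list:
#   rocks_batches | while read -r nodes; do
#       clush -b -w "$nodes" df -h /scratch < /dev/null
#   done
```

Working batch by batch means a frozen machine only stalls its own group, which can be cancelled and retried without touching the others.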