Showing posts with label ssh. Show all posts
Showing posts with label ssh. Show all posts

Wednesday, February 1, 2017

Remove known host from Secure Shell

So, I have an old chromebook that I use as a thin client (first generation - no, not a CR-48).  Problem is, RSA keys change and I don't have a proper shell.  Just 'secure shell'.  This command removes a known host from the known_hosts file.  src.

Open up Secure Shell as a browser tab.  Go
'that-funny-button-for-misc-options' --> More Tools--> Developer Tools
Go to the Console tab.  In the console, enter the following, where foo is the index number of the host record that you need to remove.
term_.command.removeKnownHostByIndex(foo)

Thursday, January 26, 2017

(Internet via Ethernet) + (SSH via wireless)

= (sustained SSH during MPICH computing)

I've been losing SSH connection after starting process jobs on my new beowulf cluster.  This is my current fix, since my theory is that the network switch is so clogged with MPI-related communication (which does take place via ssh) that there's no bandwidth left for my administrative SSH connection.  Theory supported by observation that when I plug my unrelated control machine into the switch it can't ping google.

assumptions

  • Ubuntu 16.04.1 LTS
  • working wireless and ethernet: I had to do this and this.

sources

check wifi hardware capability

Run the command 
iw list
And look for a section like the following.  If it includes 'AP' (see emboldened bit), you're golden.  If not, look for a different wireless card.
Supported interface modes:  
         * IBSS 
         * managed  
         * AP 
         * AP/VLAN  
         * monitor 

install dependencies

sudo apt-get install rfkill hostapd hostap-utils iw dnsmasq   

identify interface names

As of ubuntu 16.04, the standard wlan0 and eth0 interface names are no longer in use.  You'll have to identify them specifically.  Use the following command, which lists the contents of the folder for each interface device, and look for the device that has a folder named 'wireless'. src.
ls /sys/class/net/*
Observe the assumptions above to see what I'm calling them.

configure wifi settings

There's three files you'll have to configure.  Since I'm logged in via ssh, I don't want to interrupt my connection until I've created a new access point I can connect to.  So I'll walk through editing each file in turn, then I'll have one command at the end that activates all the changes.  

configure wireless interface: /etc/network/interfaces

Backup your current interface file.
sudo cp /etc/network/interfaces /etc/network/interfaces.bak
and then edit the original
sudo nano /etc/network/interfaces
replace the contents of the file - change the interface names as appropriate.
auto lo
iface lo inet loopback

auto enp2s0
iface enp2s0 inet dhcp

auto wlp1s0
iface wlp1s0 inet static
hostapd /etc/hostapd/hostapd.conf
address 192.168.3.14
netmask 255.255.255.0
Normally I'd say that here's where you restart the interface, but we're saving that for the end.

configure the access point: /etc/hostapd/hostapd.conf

backup the original file - it's ok if there's nothing there.
sudo cp /etc/hostapd/hostapd.conf /etc/hostapd/hostapd.conf.bak
edit the original
sudo nano /etc/hostapd/hostapd.conf
put this in:
interface=wlp1s0
driver=nl80211
ssid=test
hw_mode=g
channel=1
macaddr_acl=0
auth_algs=1
ignore_broadcast_ssid=0
wpa=3
wpa_passphrase=1234567890
wpa_key_mgmt=WPA-PSK
wpa_pairwise=TKIP
rsn_pairwise=CCMP
Inexplicably, this only seems to produce a detectable wifi access point when the ssid is 'test'.  I tried several other non-keyword names, and none of them worked.  Go back to 'test', and it worked.  Did it several times...magic.

Save and exit.

configure the DHCP server

this is where the access point actually becomes something you can access. backup: 
sudo cp /etc/dnsmasq.conf /etc/dnsmasq.conf.bak
edit original - since the file is so big, I rm'd the original and pasted the contents below into an empty file.
sudo rm /etc/dnsmasq.conf
sudo nano /etc/dnsmasq.conf
make it look like this:
# Never forward plain names (without a #dot or domain part)
domain-needed

# Only listen for DHCP on wlan0
interface=wlp1s0

# create a domain if you want, comment #it out otherwise
# domain=Pi-Point.co.uk

# Create a dhcp range on your /24 wlp1s0 #network with 12 hour lease time
dhcp-range=192.168.3.15,192.168.3.254, 255.255.255.0,12h
Save and exit.

implement changes

this is going to be one big command. if it works, you're in business...if it doesn't, you'll have to login directly to the machine for troubleshooting.
sudo ifdown wlp1s0; sudo ifup wlp1s0; sudo service hostapd restart; sudo service dnsmasq restart
Worked for me: I now have a secondary wireless access to my beowulf cluster for when the ethernet gets clogged with MPI signals.



Saturday, January 21, 2017

A Homemade Beowulf Cluster: Part 2, Machine Configuration

This section starts with a set of machines all tied together with an ethernet switch and running Ubuntu Server 16.04.1.  If the switch is plugged into the local network router, then the machines can be ssh'd into.

This should be picking up right where Part 1 left off. src.  So.

Enabling Scripted, Sudo Remote Access

The first step in the configuration process is to modify the root-owned host files on each machine.  I'm not doing that by hand, and I've already spent way too long trying to find a way to edit root-owned files through ssh automatically.

It's not possible without "security risks".  Since this is a local cluster, and my threat model doesn't include -- or care about -- people hacking in to the machines or me messing things up, I'm going the old fashioned way.  I also don't care about wiping my cluster accidentally, since I'm documenting the exact process I used to achieve it (and I'm making backups of any data I create).

Log into each machine in turn, and enter the password when prompted.
ssh beowulf@grendel-[X]
Recall that the password is
hrunting
Create a password for the root account.
sudo passwd root
At the prompt, enter your password.  We'll assume it's the same as the previously-defined user.
hrunting
Now the root account has a password, but it's still locked.  Time to unlock it.
sudo passwd -u root 
Note: if you ever feel like locking the root account again, run this:
sudo passwd -l root
Now you have to allow the root user to login via ssh.  Change an option in this file:
sudo nano /etc/ssh/sshd_config
Find the line that says:
PermitRootLogin prohibit-password
and comment it out (so you have a record of the default configuration) and add a new line below it. They should look like this:
#PermitRootLogin prohibit-password
PermitRootLogin yes
[CTRL-O] and [CTRL-X] to exit, then run:
sudo service ssh restart
That's it!  Now we can use sshpass to automatically login to the machines and modify root files.  Be careful; there is nothing between you and total destruction of your cluster.

Upload a custom /etc/hosts file to each machine

I created a script to do this for me.  If I could have found a simple way to set static IPs that would have been preferable, but this way I don't have to manually rebuild the file every time the cluster is restarted.

Note: for now, this isn't compatible with my example - it only uses node increments of digits, while my example is using letters (grendel-b vs grendel-1).  I'll fix that later.  For now, I'd recommend reading all the way to the end of the walkthrough before starting, and just using numbers for your node increments.

Run the script from a separate computer that's on the local network (i.e., that can ssh into the machines), but which isn't one of the machines in the cluster.  Usage of the script goes like this:
bash create_hosts_file.sh [MACHINE_COUNT] [PASSWORD] [HOSTNAME_BASE]
Where HOSTNAME_BASE is the standard part of the hostname of each computer - if the computers were named grendel-a, grendel-b, and grendel-c, then the base would be "grendel-".

So, continuing the example used throughout and pretending there's 5 machines in total, this is what the command would look like:
mkdir -p ~/scripts && cd scripts
wget https://raw.githubusercontent.com/umhau/cluster/master/create_hosts_file.sh
bash create_hosts_file.sh 5 "hrunting" "grendel-"
If you don't get any errors, then you're all set! You can check the files were created by ssh'ing into one of the machines and checking /etc/hosts.
ssh beowulf@grendel-a
cat /etc/hosts
The output should look something like this:
127.0.0.1     localhost
192.168.133.100 grendel-a
192.168.133.101 grendel-b
192.168.133.102 grendel-c
192.168.133.103 grendel-d
If it doesn't look like that, with a line for localhost and one line after it for each machine, you're in trouble.  Google is your friend; it worked for me.

Creating a Shared Folder Between Machines

This way, I can put my script with fancy high-powered code in one place, and all the machines will be able to access it.

First, dependencies.  Install this one just on the 'master' node/computer (generally, the most powerful computer in the cluster, and definitely the one you labelled #1).
sudo apt-get install nfs-server
Next, install this on all the other machines:
sudo apt-get install nfs-client
Ok, we need to define a folder that can be standardized across all the machines: same idea as having a folder labeled "Dropbox" on each computer that you want your Dropbox account synced to - except in this case, the syncing is a little different.  Anything you put in the /mirror folder of the master node will be shared across all the other computers, but anything you put in a /mirror folder of the other nodes will be ignored.  That's why it's called a 'mirror' - there's a single folder that's being 'mirrored' by other folders.

We'll put it in the root directory.  Since we're mirroring it across all the machines, call it 'mirror'. Do this on all the machines:
sudo mkdir /mirror
Now go back to the master machine, and tell it to share the /mirror folder to the network: add a line to the /etc/exports file, and then restart the service.
echo "/mirror *(rw,sync)" | sudo tee -a /etc/exports
sudo service nfs-kernel-server restart
Maybe also add the following to the (rw,sync) options above:

  • no_subtree_check: This option prevents the subtree checking. When a shared directory is the subdirectory of a larger filesystem, nfs performs scans of every directory above it, in order to verify its permissions and details. Disabling the subtree check may increase the reliability of NFS, but reduce security.
  • no_root_squash: This allows root account to connect to the folder.

Great!  Now there's a folder on the master node on the network that we can mount and automatically get stuff from.  Time to mount it.

There's two ways to go about this - one, we could manually mount on every reboot, or two, we could automatically mount the folder on each of the 'slave' nodes.  I like the second option better.

There's a file called the fstab in the /etc directory.  It means, 'file system tabulator'.  This is what the OS uses on startup to know which partitions to mount.  What we're going to do is add another entry to that file - on every startup, it'll know to mount the network folder and present it like another external drive.

On each non-master machine (i.e., all the slave machines) run this command to append a new entry to the bottom of the fstab file.  The bit in quotes is the part getting added.
echo "grendel-a:/mirror    /mirror    nfs" | sudo tee -a /etc/fstab
That line is telling the OS a) to look for a drive located at grendel-a:/mirror, b) to mount it at the location /mirror, and c) that the drive is a 'network file system'.  Remember that if you're using your own naming scheme to change 'grendel-a' to whatever the hostname of your master node is.

Now, in lieu of rebooting the machines, run this command on each slave machine to go back through the fstab and remount everything according to whatever it (now) says.
sudo mount -a

Establishing A Seamless Communication Protocol Between Machines

Create a new user

This user will be used specifically for performing computations.  If beowulf is the administrative user, and root is being used as the setting-stuff-up-remotely-via-automated-scripts user, then this is the day-to-day-heavy-computations user.

The home folder for this user will be inside /mirror, and it's going to be given the same userid across all the accounts (I picked '1010') - we're making it as identical as possible for the purposes of using all the machines in the cluster as a single computational device.

We'll call the new user 'breca'.  Just for giggles, let's make the password 'acerb'.  Run the first command on the master node first, and the slaves afterwards.
useradd --uid 1010 -m -d /mirror/breca breca
Set a password.  Run on all nodes.
passwd breca
Add breca to the sudo group.
sudo adduser breca sudo
Since 'breca' will be handling all the files in the /mirror directory, we'll make that user the owner.  Run this only on the master node.
sudo chown -R breca: /mirror

Setting up passwordless SSH for inter-node communication

Next, a dependency.  Install this to each node (master and slaves):
sudo apt­-get install openssh-server
Next, login to the new user on the master node.
su - breca
On the master node, generate an RSA key pair for the breca user.  Keep the default location.  If you feel like it, you can enter a 'strong' passphrase, but we've already been working under the assumption security isn't important here.  Do what you like; nobody is going after your cluster (you hope).
ssh-keygen -t rsa
Add the key to your 'authorized keys'.
cd .ssh
cat id_rsa.pub >> authorized_keys
cd
And the nice thing is, what you've just done is being automatically mirrored to the other nodes.

With that, you should have passwordless ssh communication between all of your nodes.  Login to your breca account on each machine:
su - breca
 After logging in to your breca account on all of your machines, test your passwordless ssh capabilities by running -- say, from your master node to your first slave node --
ssh grendel-b
or from your second slave node into your master node:
ssh grendel-a
The only thing you should have to do is type 'yes' to confirm that some kind of fingerprint is authentic, and that's a first-time-only sort of thing.  However, because confirmation is requested, you have to perform the first login manually between each machine.  Otherwise communication could/will fail.  I haven't checked if it's necessary to ensure communication between slave nodes, so I did those too.

Note that since the same known_hosts file is shared among all the machines, it's only ever necessary to confirm a machine once.  So you could just log into all the machines consecutively from the master node, and once into the master node from one of the slaves, and all the nodes would thereafter have seamless ssh communication.

Troubleshooting

This process worked for me, following this guide exactly, so there's no reason it wouldn't work for you as well.  If a package is changed since the time of writing, however, it may fail in the future.  See section 7 of this guide to set up a keychain, which is the likely solution.

If, after rebooting, you can no longer automatically log into your breca account within the node (master-to-slave, etc.) the /mirror mounting procedure may have been interrupted.  i.e., possibly a network disconnect when /etc/fstab was executed such that grendel-a:/mirror couldn't be found.  If that's the case, the machines can't connect without passwords because they don't have access to the RSA key stored in the missing /mirror/calc/.ssh directory.  Log into each of the affected machines and remount everything in the fstab.
sudo mount -a

Installing Software Tools

You've been in and out of the 'beowulf' and 'breca' user accounts while setting up ssh.  Now it's time to go back to the 'beowulf' account.  If you're still in the breca account, run:
exit
These are tools the cluster will need to perform computations.  It's important to install all of this stuff prior to the MPICH2 software that ties it all together - I think the latter has to configure itself with reference to the available software.

If you're going to be using any compilers besides GCC, this is the time to install them.

This installs GCC.  Run it on each computer.
sudo apt-get install build-essential
I'm probably going to want Fortran as well, so I'm including that.
sudo apt-get install gfortran

Installing MPICH

And now, what we've all been waiting for: the commands that will actually make these disparate machines act as a single cluster.  Run this on each machine:
sudo apt-get install mpich
You can test that the install completed successfully by running:
which mpiexec
which mpirun
The output should be:
/usr/bin/mpiexec
and
/usr/bin/mpirun

The Machinefile

This 'machinefile' tells the mpich software what computers to use for computations, and how many processors are on each of those computers.  The code you run on the cluster will specify how many processors it needs, and the master node (which uses the machinefile) will start at the top of the file and work downwards until it has found enough processors to fulfill the code's request.  

First, find out how many processors you have available on each machine (the output of this command will include virtual cores).  Run this on each machine.
nproc
Next, log back into the breca user on the master node:
su - breca
Create a new file in the /mirror directory of the master node and open it:
touch machinefile && nano machinefile 
The order of the machines in the file determines which will be accessed first.  The format of the file lists the hostnames with the number of cores they have available.
grendel-c:4
grendel-b:4
grendel-a:4
You might want to remove one of the master node's cores for control purposes.  Who knows?  Up for experimentation.  I put the master node last for a similar reason.  The other stuff can get tied up first.

You should be up and running!  What follows is a short test to make sure everything is actually up and running.

Testing the Configuration

Go to your master node, and log into the breca account.
ssh beowulf@grendel-a
su - breca
cd into the /mirror folder.
cd /mirror
Create a new file called mpi_hello.c
touch mpi_hello.c && nano mpi_hello.c
Put the following code into the file, and [ctrl-o] and [ctrl-x] to save and exit.
#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv) {
    int myrank, nprocs;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);

    printf("Hello from processor %d of %d\n", myrank, nprocs);

    MPI_Finalize();
    return 0;
}
Compile the code with the custom MPI C compiler:
mpicc mpi_hello.c -o mpi_hello
And run.
mpiexec -n 11 -f ./machinefile ./mpi_hello
Here's a breakdown of the command:
mpiexec              command to execute an mpi-compatible binary

-n 11                the number of cores to ask for - this should not be
                     more than the sum of cores listed in the machinefile

-f ./machinefile     the location of the machinefile

./mpi_hello          the name of the binary to run
If all went as hoped, the output should look like this:
Hello from processor 0 of 11
Hello from processor 1 of 11
Hello from processor 2 of 11
Hello from processor 3 of 11
Hello from processor 4 of 11
Hello from processor 5 of 11
Hello from processor 6 of 11
Hello from processor 7 of 11
Hello from processor 8 of 11
Hello from processor 10 of 11
Hello from processor 11 of 11
Make sure sum of the number of processors you listed in your machinefile corresponds to the number you asked for in the mpiexec command.

Note that you can totally ask for more processors than you actually listed - the MPICH sofware will assign multiple threads to each core to fulfill the request.  It's not efficient, but better than errors.

And that's it!  You have a working, tested beowulf cluster.

Tuesday, November 22, 2016

Running GUI programs over ssh command line

I've never run into this before, and it's really cool.
ssh -X pi@raspberrypi
This allows the remote computer to use the local computer's x server to run the gui for the remote program.  Sooo, I can use Mathematica on my Mint 18 laptop through an ssh connection to my RPi 2.  Very cool.




Monday, November 14, 2016

Setting up Raspberry Pi Zero in Headless Mode

Download the raspbian image, dd it to the sd card, unmount and remount it on the computer.

Figure out the  name of your micro sd card (I'm going to assume that it's sdb) - look for the device with a total size similar to what yours is labeled as.
lsblk
Download the image here.  Normally I'd give you a wget command, but it's a lot faster doing it on the browser.  Grab the "Raspbian Jessie with Pixel" version...I don't trust the Lite to have everything needed for future projects.

Extract the file (right-click and select either 'extract' or 'open with archive manager') and put it in the Downloads folder.  If you rename it to raspbian.img after extracting, then you can cut and paste the following command. Note that you should change 'foo' to the device name you identified above, e.g., of=/dev/sdb.  I'm keeping it as of=/dev/foo to prevent accidents.
sudo dd bs=4M if=/home/$USER/Downloads/raspbian.img of=/dev/foo status=progress
Remove the sd card from the computer and plug it back in.  That seems to allow the low-level processes to reassess the contents of the card after everything on it was changed.

Resize the partition in gparted if you're planning on doing anything that involves a significant amount of storage.  The Raspbian image doesn't leave a lot of free space, and your card probably has extra room on it.  Install gparted below, then run and resize (that's a GUI operation, so no walkthrough - sorry.  Also, I've been doing from nearly the beginning of my Linux career and don't need any reminders on how to do it).
sudo apt-get install gparted
Set up the networking - you'll need to tell the pi what the wifi password is (Source).  cd into the primary partition of the sd card - you'll find it as the folder named with a long string of meaningless text if you do:
ls /media/$USER
cd into that folder (represented by me here with x's and dashes) and then go several folders deeper to edit the wifi config file.  The whole thing is done below.
sudo nano /media/$USER/xxxx-xxxx-xxxxx/etc/wpa_supplicant/wpa_supplicant.conf
Add this to the end of the file, where "foo" is the name of your wifi network (literally, what shows up when you're choosing a network to connect to - it is case-sensitive) and "bar" is the wifi password. Both should be in quotation marks.
network={
  ssid="foo"
  psk="bar"
  proto=RSN
  key_mgmt=WPA-PSK
  pairwise=CCMP
  auth_alg=OPEN
}
Save and exit by pressing CTRL-O, ENTER and CTRL-X.

Unmount, stick the ssd in the pi, connect a wifi adapter and plug in the power.  Most of the time you don't even need to know the IP of the pi, as you can use the computer name to ssh in:
ssh pi@raspberrypi
Note the password is
raspberry

A few extra commands to be aware of:

Note that to ssh into a Mint 18 distro from the pi, you need to install openssh-server on Mint first.
sudo apt-get install openssh-server
If you know the user and computer names that you want to ssh into, you can use them instead of hard-to-keep-track-of IP addresses.  If everything is on the local network (LAN).

Copy files with scp:
scp user@computer:desired.file ~/path/to/containing/folder
And on an *ahem* completely unrelated note, found a few more dependencies that sphinxtrain needed in the install script.  I'll update vmc when I get a chance.

Final post here

I'm switching over to github pages .  The continuation of this blog (with archives included) is at umhau.github.io .  By the way, the ...