I must admit that I misunderstood the task when I first saw it. The vendor had changed how they provide the data. We will no longer be able to retrieve the data using ftp – the data should be picked up with sftp.
I immediately thought that this would be quite the trick, as as far as I was aware you cannot script sftp the same way you can ftp. I discussed this with my boss to get yet another dose of information. He said that this seemed an odd choice of technology and that ftps would be a better choice.
It was about this time that I realized I had made my first mistake: thinking about the programs, not the protocols. There are three different protocols, FTP, SFTP and FTPS, and two of them have a program with the same name as the protocol. The second mistake was that I was not familiar with the FTPS protocol. First, a bit about the protocols.
File Transfer Protocol (FTP)
An FTP client talks to the server over two connections. The first is the command connection, which controls the transfers, and the second is the data connection, which carries the files themselves.
Although an ftp server can be configured to use any ports, the typical setup uses ports 20 and 21.
The ftp client connects from port “X” on the client computer to port 21 on the server using TCP. Once that connection has been made, a second connection is used for transferring the data. The way this second connection is made depends on the mode that is used – active or passive.
In active mode, the second connection is actually made by the server, from port 20 to port X + 1 on the client.
It is really simple, but this usually won’t work in practice at most serious companies, because a firewall normally sits between the client and the internet. The firewall’s job is to prevent unknown computers from connecting to the machines behind it, so it blocks the server’s incoming data connection and active ftp fails.
This problem was foreseen and thus the passive mode was also created.
Passive mode is exactly the same as active mode except that the client opens the second connection instead of the server. Because both connections are now outgoing from the client, the client’s firewall usually no longer interferes.
The coordination between the two connections is pretty simple. When the ftp client initiates passive mode with the PASV command, the server replies with the address and port that the client should connect to for the data connection.
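The port numbers in both modes travel over the command connection as six comma-separated numbers: four for the IPv4 address and two for the port (high byte, then low byte). The following is a minimal sketch of that encoding; the function names and the sample address are made up for illustration.

```shell
#!/bin/bash
# Build the argument of the active-mode PORT command from an address and port.
port_argument() {
    local ip=$1 port=$2
    printf '%s,%d,%d\n' "$(echo "$ip" | tr . ,)" $((port / 256)) $((port % 256))
}

# Turn a passive-mode reply such as
# "227 Entering Passive Mode (192,168,1,10,19,137)" into host:port.
parse_pasv_reply() {
    local numbers old_ifs
    # Keep only the text between the parentheses
    numbers=${1#*\(}
    numbers=${numbers%%\)*}
    # Split the six numbers on commas into $1..$6
    old_ifs=$IFS
    IFS=,
    set -- $numbers
    IFS=$old_ifs
    echo "$1.$2.$3.$4:$(($5 * 256 + $6))"
}

port_argument 192.168.1.10 5001
parse_pasv_reply "227 Entering Passive Mode (192,168,1,10,19,137)"
```

The first call prints 192,168,1,10,19,137 and the second prints 192.168.1.10:5001, because 19 * 256 + 137 = 5001.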
Advantages:
- No size limitation on file transfers
- Some clients can be scripted
Disadvantages:
- Usernames, passwords and files are sent in clear text
- Filtering active FTP connections is difficult on your local machine (passive is preferred)
- Servers can be spoofed to send data to a random port on an unintended computer
The big disadvantage of the ftp protocol is that the username and password are communicated as clear text. Anyone sniffing packets can get this information.
FTPS (or FTPES or FTP-SSL)
The ftp program allows the user to transfer files and change directories on the remote computer. It is really useful. When the only real weakness of the file transfer protocol is that the user’s credentials are passed in clear text, it seems small enough to correct.
Indeed, that is exactly what the FTPS extensions to the file transfer protocol attempt. The change simply adds encryption to plug this particular weakness, using the Secure Sockets Layer (SSL) protocol and its successor, Transport Layer Security (TLS).
Advantages:
- Provides services for server-to-server file transfer
- SSL/TLS uses X.509 certificates to authenticate
Disadvantages:
- Requires a DATA channel, which can make it hard to use behind firewalls
- Doesn’t define a standard for file name character sets
- Not all FTP servers support SSL/TLS
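As an illustration, an FTPS upload can be scripted with curl instead of the classic ftp client. This is only a sketch: it assumes a curl build with TLS support and a server that offers explicit FTPS, and the function names are invented for the example.

```shell
#!/bin/bash
# Build the upload URL; the trailing slash tells curl to keep the local file name.
ftps_url() {
    local host=$1 remote_dir=$2
    echo "ftp://$host/$remote_dir/"
}

# Upload one file over explicit FTPS.
ftps_upload() {
    local file=$1 host=$2 user=$3 remote_dir=$4
    # --ssl-reqd makes curl give up if the server refuses TLS rather than
    # continue in clear text. Only the user name is given here, so curl
    # prompts for the password instead of reading it from the script.
    curl --ssl-reqd --user "$user" --upload-file "$file" "$(ftps_url "$host" "$remote_dir")"
}

# Example call (needs a reachable FTPS server):
# ftps_upload /var/tmp/datafile.tar ftp.somedomain.co.uk richard data
```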
Secure File Transfer Protocol (SFTP)
The SFTP protocol is also sometimes referred to as SSH File Transfer Protocol. The SFTP protocol is a network protocol that provides file access, file transfer and file management over a network connection.
All data that is transferred between the client and server, including login credentials, is encrypted. Authentication is usually done with public and private keys, but a username and password can be used instead of, or in addition to, a key.
SFTP uses only a single connection, to port 22 on the server. Both the commands and the data transfer take place over this single connection.
Advantages:
- Only one connection is needed (no special DATA connection)
- The connection is always secured
- The protocol includes operations for permission and attribute manipulation, file locking and more functionality
Disadvantages:
- SSH keys are harder to manage and validate
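For what it is worth, the OpenSSH sftp client can run unattended in batch mode. The sketch below assumes OpenSSH's sftp and that key-based authentication is already set up, since batch mode cannot ask for a password; the function names are hypothetical.

```shell
#!/bin/bash
# Print the sftp commands that upload one file into a remote directory.
sftp_batch() {
    local file=$1 remote_dir=$2
    printf 'cd %s\nput %s\nls -l\n' "$remote_dir" "$file"
}

# Feed those commands to sftp. "-b -" reads the batch from standard input,
# and sftp aborts if a command such as cd or put fails.
sftp_upload() {
    local file=$1 target=$2 remote_dir=$3
    sftp_batch "$file" "$remote_dir" | sftp -b - "$target"
}

# Example call (needs a reachable SSH server and an installed key):
# sftp_upload /var/tmp/datafile.tar richard@192.168.178.57 /tmp
```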
Putting it all to good use
The file transfer protocol is not the most secure protocol, because the user credentials are sent as clear text. This matters if you think that someone may be sniffing your packets (not so likely), but depending on the service it might not be so important.
If the data that is transferred to the FTP server is encrypted, then it may matter less if the username and password are captured. The same is true if the data is processed and then removed on the destination server. Perhaps the information is public anyway and it does not matter if it escapes (e.g. which days are public holidays for a specific trading calendar, what the trading price of a stock is, or what a company’s PE ratio is).
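Encrypting the file before it leaves the machine could look something like the sketch below. It assumes OpenSSL 1.1.1 or newer (for the -pbkdf2 option) and a passphrase stored on the first line of a separate key file; the function names and paths are placeholders, and this protects only the data, not the ftp login.

```shell
#!/bin/bash
# Encrypt a file with AES-256 before it is uploaded.
encrypt_for_upload() {
    local src=$1 dst=$2 keyfile=$3
    openssl enc -aes-256-cbc -pbkdf2 -salt -in "$src" -out "$dst" -pass "file:$keyfile"
}

# Reverse step for the receiving side, using the same key file.
decrypt_after_download() {
    local src=$1 dst=$2 keyfile=$3
    openssl enc -d -aes-256-cbc -pbkdf2 -in "$src" -out "$dst" -pass "file:$keyfile"
}

# Example call (paths are placeholders):
# encrypt_for_upload /var/tmp/datafile.tar /var/tmp/datafile.tar.enc /path/to/transfer.key
```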
A simple scripted solution using ftp
#!/bin/bash
USER=richard
PASS=secretpasscode
DATA=/var/tmp/datafile.tar

ftp -i -n ftp.somedomain.co.uk <<MARK
user $USER $PASS
pwd
cd data
bin
hash
put $DATA
ls -ltr
MARK

echo file transferred
This is a small script which, despite not being very secure, makes a connection and puts a data file. It is only possible because the ftp command reads its commands from standard input. This clever little script supplies them with a here-document, so the commands come from the script itself.
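One way to at least keep the password out of the script itself is the netrc file that most ftp clients consult for auto-login. This is a sketch under that assumption; write_netrc is a made-up helper, and it overwrites whatever file it is pointed at.

```shell
#!/bin/bash
# Write a single netrc entry to the given file and make it private.
# Note: this replaces any existing content of that file.
write_netrc() {
    local netrc=$1 host=$2 user=$3 pass=$4
    # Create the file with a restrictive umask so it is never world-readable
    ( umask 077
      printf 'machine %s login %s password %s\n' "$host" "$user" "$pass" > "$netrc" )
    chmod 600 "$netrc"
}

# Example call:
# write_netrc "$HOME/.netrc" ftp.somedomain.co.uk richard secretpasscode
```

With the entry in ~/.netrc, the ftp call can drop the -n option and the "user" line, because the client then logs in automatically. The credentials still cross the network in clear text.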
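The here-document trick does not carry over directly to the sftp program: sftp can read commands in batch mode (the -b option), but it will not take a password from the script. It is, however, possible to create a script using secure copy.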
#!/bin/bash
USER=richard
MACHINE=192.168.178.57
DATA=/var/tmp/datafile.tar

scp $DATA $USER@$MACHINE:/tmp
This script is even smaller than the ftp script and is clearer to the casual reader.
The scp command still requires a password and is not “scriptable” in the same way that the ftp client was. However, it is possible to set up the user account so that a public/private key pair is used for authentication and a password isn’t necessary.
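The key setup might look roughly like this, assuming the OpenSSH tools (ssh-keygen, ssh-copy-id, scp) are installed. The function names and the key path are placeholders, and a key without a passphrase should be readable only by the account that runs the script.

```shell
#!/bin/bash
# Create an Ed25519 key pair with an empty passphrase for unattended use.
create_transfer_key() {
    local keyfile=$1
    ssh-keygen -q -t ed25519 -N "" -f "$keyfile"
}

# Install the public key on the remote account (asks for the password once).
install_transfer_key() {
    local keyfile=$1 target=$2
    ssh-copy-id -i "$keyfile.pub" "$target"
}

# Copy a file using the key; BatchMode makes scp fail rather than prompt.
scp_with_key() {
    local keyfile=$1 file=$2 destination=$3
    scp -i "$keyfile" -o BatchMode=yes "$file" "$destination"
}

# Example calls:
# create_transfer_key "$HOME/.ssh/id_transfer"
# install_transfer_key "$HOME/.ssh/id_transfer" richard@192.168.178.57
# scp_with_key "$HOME/.ssh/id_transfer" /var/tmp/datafile.tar richard@192.168.178.57:/tmp
```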
ftp Scripting – Extra credit edition
For the really paranoid who must continue to use ftp, you might not want to connect directly to a server over the internet but instead connect via a proxy server. It is possible to do that using ftp. Simply pass all the information necessary for the proxy server.
#!/bin/bash
#our destination machine
USER=dick
PASS=secretpasscode
DEST=ftp.somedomain.co.uk

#our proxy server
PROXYUSER=bob
PROXYPASS=secret
PROXYMACH=myproxy.mydomain.com

DATA=/var/tmp/datafile.tar

ftp -i -n $PROXYMACH <<MARK
user $USER@$PROXYUSER@$DEST $PASS@$PROXYPASS
cd data
bin
hash
put $DATA
ls -ltr $DATA
MARK

echo file transferred
Although it is possible to add this level of complexity to buffer your server from the ills of the world, it really wouldn’t be more secure than using secure copy (scp) for transferring the data files. Secure copy would have encrypted credentials, and the key used as a “password” would most likely be considerably longer and more secure than any normal password.
Key length and password length are not directly comparable, but the comparison still favours keys. A 256-bit elliptic-curve key such as Ed25519 offers roughly 128 bits of security, which is comparable to a truly random 16 to 20 character password and far better than anything a person would normally choose. With RSA, pick 2048 bits or more; shorter RSA keys such as 1024 bits are no longer considered adequate.
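As a rough yardstick, the strength of a password can be estimated as length times log2 of the alphabet size. The sketch below assumes every character is chosen uniformly at random from the 94 printable ASCII characters, which is an upper bound that human-chosen passwords rarely reach; the function name is made up.

```shell
#!/bin/bash
# Estimate the entropy in bits of a random password:
# length * log2(alphabet size).
password_entropy_bits() {
    awk -v n="$1" -v c="$2" 'BEGIN { printf "%.1f\n", n * log(c) / log(2) }'
}

password_entropy_bits 16 94
password_entropy_bits 20 94
```

This prints 104.9 and 131.1, so even a perfectly random 20 character password is only about on par with the roughly 128 bits of security of a modern key.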