
Migration of files from AS400 and Windows server to AWS S3 bucket using AWS Snowball Edge with Linux Workstation

Connectivity Diagram


Use case:

The customer wanted to move around 40 TB of data from on-premises to AWS. The data resided on an AS400 server and a Windows server. We first tried the AWS DataSync service, but it was very slow because connectivity between on-premises and AWS ran over a low-bandwidth Direct Connect link, so we decided to move forward with AWS Snowball devices.


Assumption:

You have ordered two Snowball Edge devices and they have arrived at your corporate office. We ordered two devices because we wanted to migrate the data in parallel and quickly.

Q. What is AS400?

A. AS400 is IBM's legacy midrange computer. Companies are moving away from AS400 systems and adopting cloud technology.
 

Q. Why did we choose an intermediary Linux workstation?

A. AWS recommends using Linux as an intermediary workstation. In addition, almost all of our files were smaller than 2 MB, so to avoid performance issues we went with Linux workstations. The AS400 also does not have a built-in Linux shell to run common Linux commands.


Q. Why did we choose Snowball Edge Storage Optimized?

A. Since we mostly have small files, we transfer them by batching them together into a single archive. Snowball Edge automatically extracts the contents of archived files when the data is imported into Amazon S3. (ref)



Steps to perform migration:

Build Linux Workstation:

  • Install any Linux OS on two physical machines; in our case we installed Ubuntu.
  • Install the NFS client on both workstations.
  • Assign a static IP to each workstation:
    • The first workstation's IP address was 192.168.1.118, i.e. corp-w1-nfs.corp.net
    • The second workstation's IP address was 192.168.1.119, i.e. corp-w2-nfs.corp.net
  • AS400 IP address: 192.168.1.112
  • Windows server IP address: 192.168.1.46
  • First Snow device IP address: 192.168.1.20 (assumed)
  • Second Snow device IP address: 192.168.1.21 (assumed)
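On Ubuntu, a static IP can be assigned via netplan; the following is a minimal sketch for the first workstation (the file name, interface name eno1 and gateway 192.168.1.1 are assumptions, not values from the original setup):

```yaml
# /etc/netplan/01-static.yaml (example file name)
network:
  version: 2
  ethernets:
    eno1:                        # assumed interface name; check yours with `ip link`
      addresses: [192.168.1.118/24]
      routes:
        - to: default
          via: 192.168.1.1       # assumed gateway
      nameservers:
        addresses: [192.168.1.1]
```

Apply the configuration with `sudo netplan apply`, then repeat on the second workstation with 192.168.1.119.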

DNS Configuration:

This step is a prerequisite for NFS to work properly with the AS400 system.
  • Add DNS entries for hostnames:
    • corp-w1-nfs.corp.net -> 192.168.1.118
    • corp-w2-nfs.corp.net -> 192.168.1.119
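If a DNS server is not available, the same mapping can be sketched as /etc/hosts entries. The block below writes to a scratch file for illustration; on a real workstation the two lines would go into /etc/hosts itself:

```shell
# Illustrative sketch: the two workstation hostnames mapped to their static IPs.
# Writing to a scratch file here; on a real system this content belongs in /etc/hosts.
HOSTS_FILE=./hosts.example
cat > "$HOSTS_FILE" <<'EOF'
192.168.1.118   corp-w1-nfs.corp.net corp-w1-nfs
192.168.1.119   corp-w2-nfs.corp.net corp-w2-nfs
EOF
grep -c 'corp.net' "$HOSTS_FILE"   # counts the entries just written; prints 2
```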

AS400 and Windows Server Configuration:

Configure NFS exports on the AS400 and Windows Server, granting access to the workstation IP addresses.


Software Installation on Linux Workstations:

$ sudo apt update
$ sudo apt upgrade -y
$ sudo apt install nfs-common telnet nload inetutils-traceroute traceroute python2.7 unzip -y

Install AWS CLI:

$ curl "https://s3.amazonaws.com/aws-cli/awscli-bundle-1.16.14.zip" -o "awscli-bundle.zip"
$ unzip awscli-bundle.zip
$ sudo ./awscli-bundle/install -i /usr/local/aws -b /usr/local/bin/aws

Mount AS400 and Windows Server Directories:

$ mkdir /IMAGE
$ chmod -R 755 /IMAGE
$ mount -t nfs 192.168.1.112:/IMAGE/ /IMAGE/

$ mkdir /AS400
$ chmod -R 755 /AS400
$ mount -t nfs 192.168.1.xxx:/AS400/ /AS400

Add a mount entry to /etc/fstab, something like the following:

192.168.1.112:/IMAGE /IMAGE nfs rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=12,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.1.112,mountvers=3,mountport=62870,mountproto=udp,local_lock=none,addr=192.168.1.112 0 0

Note: add a similar entry for the /AS400 mount as well.
After mounting the filesystems on the workstations, run a simple cp command to verify that they work properly.
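As a sketch of that check (the directories below are stand-ins for illustration; in practice the source would be the real NFS mount, e.g. /IMAGE):

```shell
# Sanity-check a mounted share by copying a file and comparing checksums.
# ./src_mount stands in for the NFS mount (e.g. /IMAGE) in this sketch.
SRC=./src_mount
DST=./dst_check
mkdir -p "$SRC" "$DST"
echo "sample data" > "$SRC/testfile.txt"

cp "$SRC/testfile.txt" "$DST/"

# Checksums must match if the copy (and therefore the mount) works correctly
src_sum=$(md5sum "$SRC/testfile.txt" | awk '{print $1}')
dst_sum=$(md5sum "$DST/testfile.txt" | awk '{print $1}')
[ "$src_sum" = "$dst_sum" ] && echo "copy OK"
```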

Prepare AWS Snow Device:

Download and install the Snowball Edge client:

$ cd /usr/local
$ wget https://snowball-client.s3.us-west-2.amazonaws.com/latest/snowball-client-linux.tar.gz
$ tar -xvf snowball-client-linux.tar.gz
$ echo 'export PATH=$PATH:/usr/local/snowball-client-linux-1.2.0-330/bin' >> /etc/profile
$ source /etc/profile

Job Preparation for AWS Snow Device:
  • Prepare directories for transfer to the AWS Snow device.
  • Follow AWS recommendations for batch size and file size:
    • AWS recommends that you limit your batches to about 10,000 files. (ref)
    • Make sure each batched file is no larger than 100 GB.
    • The default quota for AWS Snowball Edge devices is 1 device per account; if you are ordering more than one, ask AWS Support to increase the quota with proper justification.
    • Keep two RJ45 cables available (the device has 10 Gb RJ45 network ports).
  • Plug the RJ45 cable and power cable into the Snow device, and make sure all devices (Snow device, Linux workstations, AS400 and Windows server) are on the same local network.
  • Now set up the network and assign an IP to the device. The IP can be assigned via DHCP or statically (preferably assign a static IP so that it remains the same until all files have been transferred to the Snow device).
  • Go to the AWS console, open the job for your Snow device and click Get credentials. You will see an unlock code; download the manifest and copy it to both workstations.
  • Configure a profile for the Snowball Edge client on both workstations, one per Snowball device.

Setting up a profile for the 1st Snow device:

# snowballEdge configure --profile sbe1
  • You will need to provide the path to the manifest downloaded from the AWS console, the unlock code (also shown in the AWS console), and the endpoint of the Snowball Edge displayed on its screen. Make sure you prefix the endpoint with https://.
Default Endpoint [https://192.168.1.20]: https://192.168.1.20
  • 192.168.1.20 is the assumed IP address of the first Snow device.
  • Likewise, configure another profile for the 2nd Snow device:
# snowballEdge configure --profile sbe2
  • Try pinging both Snow devices:
# ping 192.168.1.20
  • Now unlock the Snow devices with their respective profiles:
# snowballEdge unlock-device --profile sbe1
  • Next, describe the device with its respective profile:
# snowballEdge describe-device --profile sbe1
  • It can take some time for the device to unlock.
  • Now set up a profile for accessing the S3 bucket:
# snowballEdge list-access-keys --profile sbe1

Expected output:
{
  "AccessKeyIds" : [ "AKIAIOSFODNN7EXAMPLE" ]
}
  • Now get the secret access key:
# snowballEdge get-secret-access-key --access-key-id AKIAIOSFODNN7EXAMPLE --profile sbe1

Example Output
[snowballEdge]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
  • Configure the AWS CLI with the output above, and use this profile when sending data to the AWS S3 bucket:
# aws configure --profile sbe1
AWS Access Key ID [None]: AKIAIOSFODNN7EXAMPLE
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: snow
Default output format [None]: json

Note - You must specify the default region as snow, and the commands above must be run for both devices.
  • Now run the following script
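The post does not reproduce the script itself. Below is a minimal sketch of what such a batching-and-upload script could look like under the recommendations above; the bucket name my-snow-bucket, the sample paths and the demo files are illustrative assumptions, and the actual upload is left commented out since it requires an unlocked device:

```shell
# Sketch: batch small files into tar archives of at most 10,000 files each,
# then upload each archive to the Snowball Edge S3 endpoint.
SRC_DIR=./IMAGE_sample            # stands in for the NFS mount (e.g. /IMAGE)
BATCH_DIR=./batches
mkdir -p "$SRC_DIR" "$BATCH_DIR"
for i in 1 2 3; do echo "file $i" > "$SRC_DIR/file$i.txt"; done   # demo files

# Split the full file list into chunks of <= 10,000 files (AWS recommendation)
find "$SRC_DIR" -type f | split -l 10000 - "$BATCH_DIR/filelist."

# Create one tar archive per chunk
n=0
for list in "$BATCH_DIR"/filelist.*; do
  n=$((n + 1))
  tar -cf "$BATCH_DIR/batch$n.tar" -T "$list"
done
echo "created $n archive(s)"

# Then upload each archive to the device's S3 endpoint, asking Snowball Edge to
# auto-extract it on import into Amazon S3 (bucket name is an assumption):
# aws s3 cp "$BATCH_DIR/batch1.tar" s3://my-snow-bucket/ \
#   --endpoint https://192.168.1.20:8443 --profile sbe1 \
#   --metadata snowball-auto-extract=true
```

Run the equivalent against the second device with --profile sbe2 and its endpoint.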

Optimize for large files configuration:

  • If you need to transfer numerous large files, consider dividing bigger files into smaller segments. This approach increases the number of threads available for parallelized object transfers.
  • When using s3 cp or s3 sync to transfer data to Snowball Edge, you can fine-tune the AWS CLI by modifying the configuration file located at ~/.aws/config, for example under the relevant profile section:

[profile sbe1]
s3 =
  max_concurrent_requests = 30
  multipart_threshold = 32MB
  multipart_chunksize = 32MB
  • The AWS S3 transfer commands support multi-threading, providing an opportunity to optimize the transfer of large files through the configuration of the following parameters:
    • max_concurrent_requests: This parameter determines the maximum number of concurrent requests allowed simultaneously. The default value is 10, but for optimal throughput, it is recommended to set this parameter as high as your connection can handle. As a best practice, keep the configured value equal to or below 40.
    • multipart_chunksize: This parameter defines the size of each part in a multipart upload for an individual file. By adjusting this setting, you can break down larger files into smaller parts, improving upload speeds. The default value is 8 MB. It's important to find a balance between the part size and the total number of parts: a multipart upload can consist of at most 10,000 distinct parts per file.
    • multipart_threshold: This parameter sets the size threshold for multipart transfers of individual files. The default value is 8 MB.
    • By appropriately configuring these parameters, you can enhance the efficiency of large file transfers on AWS S3.

Outbound Ports to be Opened on Workstation

Port   Protocol   Comment
22     TCP        SSH
2049   TCP        NFS endpoint
9091   TCP        Endpoint for device management
111    TCP/UDP    NFS (rpcbind)
8080   TCP        HTTP endpoint for Amazon S3 on Snowball Edge
8443   TCP        HTTPS endpoint for Amazon S3 on Snowball Edge
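A quick way to sketch-check that a port is reachable from a workstation uses bash's /dev/tcp pseudo-device (the device IP 192.168.1.20 is the assumed address from earlier; the block below probes localhost only for illustration):

```shell
# Check whether a TCP port is reachable on a host, using bash's /dev/tcp.
check_port() {
  local host=$1 port=$2
  if timeout 2 bash -c "echo > /dev/tcp/$host/$port" 2>/dev/null; then
    echo "port $port open on $host"
  else
    echo "port $port closed on $host"
  fi
}

# Against the Snow device this would be e.g.: check_port 192.168.1.20 8443
check_port 127.0.0.1 22
```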


Turning Off the Snowball Edge

  • Once you have completed the data transfer to the AWS Snowball Edge device, it's time to prepare it for its return journey to AWS. Before proceeding, ensure that all data transfers to the device have concluded. If you utilized the file interface for data transfer, disable it before powering off the device. Refer to the documentation on Disabling the File Interface for more details.
  • After confirming that all communication with the device has ceased, power it off by pressing the power button located above the LCD display. It typically takes approximately 20 seconds for the device to complete the shutdown process.


Returning the Snowball Edge Device

  • The E Ink display features a prepaid shipping label that contains the accurate address for returning the AWS Snowball Edge device.



References:
