
Distributed Computing

This website demonstrates the use of wikis as a teaching and learning tool.

The course instructor is also happy to share the teaching materials here with anyone who finds them useful.

More about Distributed File Systems

A Distributed Computing Lecture by Steven Choy

SMB and Samba

Introducing SMB

  • SMB is Microsoft’s protocol to share files and printers
  • Later renamed CIFS (Common Internet File System)
  • Client/Server, no location transparency
  • Not the same as Samba
    • Samba is an open source implementation of SMB, primarily found on UNIX-like systems (e.g. Linux)
  • SMB usually runs on NetBIOS (naming + sessions + datagram)
NetBIOS (Network Basic Input/Output System): The NetBIOS API allows applications on separate computers to communicate over a local area network. In modern networks, it normally runs over TCP/IP, giving each computer in the network both a NetBIOS name and an IP address corresponding to a host name. (Source)
Windows Internet Name Service (WINS) is Microsoft's implementation of NetBIOS Name Server (NBNS) on Windows, a name server and service for NetBIOS computer names. Effectively, WINS is to NetBIOS names, what DNS is to domain names - a central mapping of host names to network addresses. (Source)
  • NetBIOS + SMB developed for LAN use
  • A number of other services run on top of SMB
    • In particular MS-RPC, a modified variant of DCE-RPC

What is Samba?

  • SMB/CIFS File Server
  • Authentication Server
  • Bridge between UNIX and Windows Networks
  • Runs under UNIX, VMS, Linux, FreeBSD, and more!

SMB Protocol

  • Request/response.
  • Runs atop TCP/IP.
  • E.g., file and print operations.
    • Open, close, read, write, delete, etc.
    • Queuing/dequeuing files in the printer spool.

Samba: How does it work?

  • Set of UNIX applications running the Server Message Block (SMB) protocol.
    • SMB is the protocol MS Windows uses for client-server interactions over a network.
    • By running SMB, Unix systems appear as another MS Windows system.

SMB Message

  • Header + command/response.
  • Header: protocol id, command code, etc.
  • Command: command parameters.
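
To make the header/command split concrete, here is a minimal sketch in Python that packs and unpacks a message header. It assumes the classic 32-byte SMB1/CIFS header layout; the helper names (build_header, parse_header) and the field values are illustrative and are not taken from the lecture.

```python
import struct

# Assumed layout: the classic 32-byte SMB1/CIFS header, little-endian, no padding.
SMB_HEADER = struct.Struct("<4sBIBHH8sHHHHH")
SMB_COM_NEGOTIATE = 0x72  # command code for "negotiate protocol"


def build_header(command, tid=0, pid=0, uid=0, mid=0):
    """Pack a request header; command parameters and data would follow it."""
    return SMB_HEADER.pack(
        b"\xffSMB",    # protocol id
        command,       # command code
        0,             # status (zero in a request)
        0,             # flags
        0,             # flags2
        0,             # high 16 bits of the process id
        b"\x00" * 8,   # security signature (unused in this sketch)
        0,             # reserved
        tid,           # tree id: identifies the share
        pid,           # low 16 bits of the process id
        uid,           # user id assigned at session setup
        mid,           # multiplex id: pairs a response with its request
    )


def parse_header(raw):
    """Return (protocol id, command code) from the first 32 bytes of a message."""
    fields = SMB_HEADER.unpack(raw[:SMB_HEADER.size])
    return fields[0], fields[1]


header = build_header(SMB_COM_NEGOTIATE, mid=1)
print(len(header), parse_header(header))
```

A real client would append the command parameters after these bytes; the sketch stops at the header.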

Establishing an SMB Connection

  • Establish TCP connection.
  • Negotiate protocol variant.
    • Client sends SMBnegprot.
    • The message carries a list of the variants (dialects) the client can speak.
    • Server responds with index into client’s list.
  • Set session and login parameters.
    • Account name, password, workgroup name, etc.
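
The negotiation step can be sketched as follows. This is a simplified model, not Samba's code: it assumes the client lists its dialects from oldest to newest, so the server answers with the highest index it also supports, and it uses 0xFFFF as the "no common dialect" reply. The function name choose_dialect is hypothetical.

```python
NO_COMMON_DIALECT = 0xFFFF  # reply meaning "none of the offered dialects is supported"


def choose_dialect(client_dialects, server_dialects):
    """Return the index, in the client's list, of the dialect the server picks.

    Assumption: the client orders dialects from oldest to newest, so the
    server prefers the highest index that it also supports.
    """
    for index in range(len(client_dialects) - 1, -1, -1):
        if client_dialects[index] in server_dialects:
            return index
    return NO_COMMON_DIALECT


# Dialect strings offered by the client, oldest first.
client_offer = ["PC NETWORK PROGRAM 1.0", "LANMAN1.0", "LM1.2X002", "NT LM 0.12"]
# A hypothetical older server that does not speak the newest dialect.
server_supports = {"LANMAN1.0", "LM1.2X002"}

index = choose_dialect(client_offer, server_supports)
if index == NO_COMMON_DIALECT:
    print("no common dialect")
else:
    print(index, client_offer[index])
```

Replying with an index rather than a name keeps the response small and unambiguous, since both sides already hold the client's list.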

A brief guide to setting up Samba

  • 1. Have samba installed.
      sudo aptitude install samba
  • 2. Configure samba
      sudo /etc/init.d/samba stop
      sudo gedit /etc/samba/smb.conf

          [global]

          workgroup = YOUR_WORKGROUP

          [MyFiles]

          path = /media/samba/
          browseable = yes
          read only = no
          guest ok = no
          create mask = 0644
          directory mask = 0755
          force user = YOUR_USERNAME
          force group = YOUR_USERGROUP
  • 3. Start samba and set up user accounts
      sudo /etc/init.d/samba start
      sudo smbpasswd -L -a root
      sudo smbpasswd -L -e root
  • 4. Change network settings in Windows
    01 - Click "START"
    02 - Click "Control Panel"
    03 - Click "Network Connections"
    04 - Find your "LAN Connection"
    05 - Right-click the icon and select "Properties"
    06 - Select the "TCP/IP" Protocol and click the "Properties" button
    07 - Click "Advanced"
    08 - Select the third Tab entitled "WINS"
    09 - Click "Add"
    10 - Type in the ip-address of your Linux box
    11 - Click "Add"
    12 - Select "Use NetBIOS over TCP/IP"
    13 - Click "OK"
    14 - Click "OK"
    15 - Click "OK"
    16 - Reboot Windows
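
The share definition from step 2 can be inspected programmatically before Samba is restarted. The sketch below is only an illustration: it relies on the fact that this small fragment happens to be valid INI syntax for Python's configparser, which is not true of every smb.conf, and the helper effective_mode is a hypothetical name. It shows how "create mask" and "directory mask" limit permissions: bits not set in the mask are removed from the requested mode.

```python
import configparser

# The share from step 2, written flush-left so configparser does not
# treat indented lines as continuations of the previous value.
SMB_CONF = """\
[global]
workgroup = YOUR_WORKGROUP

[MyFiles]
path = /media/samba/
browseable = yes
read only = no
guest ok = no
create mask = 0644
directory mask = 0755
"""

# interpolation=None: "%" in values must not be interpreted by configparser.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SMB_CONF)
share = parser["MyFiles"]

writable = not share.getboolean("read only")
create_mask = int(share["create mask"], 8)        # masks are octal
directory_mask = int(share["directory mask"], 8)


def effective_mode(requested, mask):
    """Permission bits that survive after the mask is ANDed with the request."""
    return requested & mask


print("writable:", writable)
print("new file mode:", oct(effective_mode(0o777, create_mask)))
print("new directory mode:", oct(effective_mode(0o777, directory_mask)))
```

For checking a real configuration file, Samba's own testparm utility is the better tool; the sketch only illustrates what the options mean.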

Some Revisions

Discuss and compare NFS (Network File System), AFS (Andrew File System) and SMB in terms of the following:

  • Access Transparency
  • Location Transparency
  • Scalability
  • Client Caching
  • Client Cache Location
  • Cache Consistency Protocol
  • Cache Update Policy
  • Stateful vs. Stateless
  • File Replication (Server-side Cache)
  • Security
Client caching: In NFS, the client module caches the results of remote file operations in memory to reduce the number of requests sent to servers. In AFS, once a copy of a file has been transferred to a client computer, it is stored in a cache on the local hard disk.
Cache consistency protocol: NFS uses a delayed-write policy, and the validity of a cached file is checked periodically (every 30 seconds) using a timestamp associated with the cached file. AFS uses callback promises to tell whether the copy of a remote file in the client cache is up to date: when a file is updated on the server, the server notifies the other clients that hold it in their caches. Callback promises are also used when a workstation recovers after a failure.
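
The two consistency schemes can be contrasted in a small sketch. This is a simplified model under stated assumptions, not either system's implementation: the class names, the fetch_server_mtime callable and the single 30-second interval are illustrative.

```python
FRESHNESS_INTERVAL = 30.0  # seconds between validity checks (NFS-style)


class NfsCacheEntry:
    """Cached file data validated by comparing timestamps with the server."""

    def __init__(self, data, server_mtime, now):
        self.data = data
        self.server_mtime = server_mtime  # modification time seen at fetch
        self.last_validated = now         # when the server was last asked

    def is_valid(self, now, fetch_server_mtime):
        # Checked recently: trust the cache without contacting the server.
        if now - self.last_validated < FRESHNESS_INTERVAL:
            return True
        # Otherwise ask the server for its timestamp and compare.
        current = fetch_server_mtime()
        self.last_validated = now
        return current == self.server_mtime


class AfsCacheEntry:
    """Cached whole file guarded by a callback promise (AFS-style)."""

    def __init__(self, data):
        self.data = data
        self.promise_valid = True

    def callback(self):
        # Invoked by the server when another client updates the file.
        self.promise_valid = False

    def is_valid(self):
        # No polling: the copy is trusted until the server breaks the promise.
        return self.promise_valid


nfs_entry = NfsCacheEntry(b"v1", server_mtime=100, now=0.0)
print(nfs_entry.is_valid(10.0, lambda: 200))  # inside the interval, no check made
print(nfs_entry.is_valid(45.0, lambda: 200))  # revalidated against the server

afs_entry = AfsCacheEntry(b"v1")
afs_entry.callback()                          # server reports an update
print(afs_entry.is_valid())
```

The NFS-style entry can serve stale data for up to one interval, while the AFS-style entry moves the work to the server, which must remember which clients hold promises.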

Discuss the main problems facing the design of a good distributed file system.

  • Transporting many files over the network can easily cause sluggish performance and high latency; network bottlenecks and server overload can result.
  • The security of data is another important issue: how can we be sure that a client is really authorized to have access to information and how can we prevent data being sniffed off the network?
  • Two further problems facing the design are related to failures. Often client computers are more reliable than the network connecting them and network failures can render a client useless.
  • Similarly a server failure can be very unpleasant, since it can disable all clients from accessing crucial information.

Filesystem in Userspace

Reference: FUSE: Filesystem in Userspace

Introduction

  • With FUSE it is possible to implement a fully functional filesystem in a userspace program.
  • Features include:
    • Simple library API
    • Simple installation (no need to patch or recompile the kernel)
    • Secure implementation
    • Userspace - kernel interface is very efficient
    • Usable by non privileged users
    • Runs on Linux kernels 2.4.X and 2.6.X
    • Has proven very stable over time

How does it work?

Some Examples of FUSE

A little demonstration

  • Step 1: Install s3fs
      apt-get install build-essential libcurl4-openssl-dev libxml2-dev libfuse-dev
      wget http://s3fs.googlecode.com/files/s3fs-r177-source.tar.gz
      tar -xzf s3fs*
      cd s3fs
      make
      make install
      mkdir /mnt/s3
  • Step 2: Mount
      /usr/bin/s3fs mys3bucket -o accessKeyId=xxx -o secretAccessKey=yyy /mnt/s3
  • Step 3: Unmount
      /bin/umount /mnt/s3

Large Scale File Storage and Serving

Reference: Beyond the File System: Designing Large-Scale File Storage and Serving - A presentation by Cal Henderson about designing large scale file storage and serving.

Big file systems requirements

  • Scalable
    • Looking at storage and serving infrastructures
  • Reliable
    • Looking at redundancy, failure rates, on the fly changes
  • Economic
    • Looking at upfront costs, TCO (total cost of ownership) and lifetimes

Four main factors

  • Storage
  • Serving
  • BCP (Business Continuity Planning)
  • Cost

Some Real Life Examples

MogileFS is similar to the Andrew File System in its design goal. It differs from a traditional filesystem in that the user has to access files via an API. However, it is possible to implement the file system in user space using FUSE or a similar package. A typical MogileFS cloud consists of a database, some number of trackers, and some number of storage units.
An illustration: Diagram 1
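
The division of labour between database, trackers and storage units can be pictured with a toy in-memory model. Everything here is an assumption made for illustration (the class and function names, round-robin placement, two replicas per key); it does not use the real MogileFS API.

```python
import itertools


class Tracker:
    """Toy tracker: chooses replica locations and answers "where is this key?"."""

    def __init__(self, storage_nodes, replicas=2):
        # storage_nodes maps a node name to a dict standing in for its disk.
        # Assumes replicas <= number of nodes, so the copies land on distinct nodes.
        self.storage_nodes = storage_nodes
        self.replicas = replicas
        self.database = {}  # key -> list of node names holding a copy
        self._next_node = itertools.cycle(sorted(storage_nodes))

    def store(self, key, data):
        chosen = [next(self._next_node) for _ in range(self.replicas)]
        for name in chosen:
            self.storage_nodes[name][key] = data
        self.database[key] = chosen

    def locate(self, key):
        return list(self.database.get(key, []))


def fetch(tracker, key, failed=()):
    """Client side: ask the tracker for locations, then read the first live copy."""
    for name in tracker.locate(key):
        if name not in failed:
            return tracker.storage_nodes[name][key]
    raise KeyError(key)


nodes = {"store1": {}, "store2": {}, "store3": {}}
tracker = Tracker(nodes)
tracker.store("photo.jpg", b"jpeg bytes")
print(tracker.locate("photo.jpg"))
print(fetch(tracker, "photo.jpg", failed={"store1"}))
```

Because the client reaches files only through calls such as store and fetch, the model also shows why access goes through an API rather than ordinary filesystem calls.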
  • Amazon S3

Extra Materials for Probing Further

On a typical local area network (LAN) where every machine runs Windows, the "Network Neighborhood" feature lets the Windows computers share files with one another. But what if the LAN also contains a Linux host? How can the Linux host join the Windows "Network Neighborhood", so that Windows computers can access files on the Linux host through it? That is the main purpose of the Samba server.
As many fellow Ubuntu users seem to have trouble setting up samba peer-to-peer with Windows I decided to write a small howto on this matter. NOTE: I am aware that there's a wiki-page as well as several other howtos around - but by looking at the constant "how do I set up samba" posts that are floating around in the forum I simply see the need for a more thorough guide on this matter.
Samba is a set of tools to share files and printers with computers running Windows. It implements the SMB network protocol, which is the heart of Windows networking.
Samba is an Open Source/Free Software suite that has provided file and print services to all manner of SMB/CIFS clients, including the numerous versions of Microsoft Windows operating systems. Samba is freely available under the GNU General Public License.
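In a Windows environment, machines usually share resources with one another through Network Neighborhood, but sharing resources with UNIX-based operating systems is harder. Files are generally exchanged between Windows and UNIX-based systems using FTP or NFS, but these only allow resources to be requested in one direction and do not provide two-way interoperability. Fortunately, Samba Server solves this problem.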
This tutorial explains how to turn an old PC with additional hard disks into a simple home file server. The file server is intended for home use. The home file server is accessible by Windows and Linux computers in the home network.

Thanks for Reading

If you would rather have this lecture note in printed format, please click the print action link in the top right corner.

If you find any problem in this lecture note, please feel free to reach Steven at steven@findaway.hk

Page last modified on March 14, 2010, at 03:08 PM