
Setting up HBase

May 6, 2008 (Tuesday) 14:35  Author: chua  Mood: so-so

URL: http://hadoop.apache.org/hbase/docs/r0.1.1/api/overview-summary.html
This setup builds on an HDFS cluster that already exists.
1: Edit hadoop/contrib/hbase/conf/hbase-env.sh
Set the JAVA_HOME path there.

2: Edit hadoop/contrib/hbase/conf/hbase-site.xml and add the following
  <property>
    <name>hbase.master</name>
    <value>10.0.4.121:11100</value>
    <description>The host and port that the HBase master runs at.</description>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://10.0.4.121:10100/hbase</value>
    <description>The directory shared by region servers.</description>
  </property>
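
The two values above pack several pieces into one string each: hbase.master is a plain host:port pair, while hbase.rootdir is a URI whose host and port presumably have to match the namenode of the existing HDFS. As a rough sketch (the class and method names here are made up for illustration, they are not part of HBase), the values can be taken apart like this to double-check them:

```java
import java.net.URI;

// Hypothetical helper, not part of HBase: it only takes apart the two
// values used in hbase-site.xml above so each piece can be checked.
class HBaseSiteValues {

    // Splits a "host:port" value such as the one used for hbase.master.
    static String[] splitHostPort(String value) {
        int idx = value.lastIndexOf(':');
        if (idx < 0) {
            throw new IllegalArgumentException("expected host:port, got " + value);
        }
        return new String[] { value.substring(0, idx), value.substring(idx + 1) };
    }

    // Parses the hbase.rootdir value as a URI (scheme, host, port, path).
    static URI parseRootDir(String value) {
        return URI.create(value);
    }

    public static void main(String[] args) {
        String[] master = splitHostPort("10.0.4.121:11100");
        System.out.println("master host=" + master[0] + " port=" + master[1]);

        URI rootDir = parseRootDir("hdfs://10.0.4.121:10100/hbase");
        System.out.println("rootdir scheme=" + rootDir.getScheme()
                + " host=" + rootDir.getHost()
                + " port=" + rootDir.getPort()
                + " path=" + rootDir.getPath());
    }
}
```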
3: Start HBase
hadoop/contrib/hbase/bin/start-hbase.sh

4: See http://wiki.apache.org/hadoop/Hbase/HbaseShell for the shell operations
4.1 First enter the shell: hadoop/contrib/hbase/bin/hbase shell
4.2 Create a table
 create table offer(image_big,image_small);
4.3 Insert, query, and delete data
  For example:
  insert into offer(image_big:,image_small:) values ('abcdefg','abc') where row = 'testinsert';
  insert into offer(image_big:,image_small:) values ('hijklmn','hij') where row = 'testinsert';
  insert into offer(image_big:content,image_big:path,image_small:content,image_small:path) values ('abcdefg','path_big','abc','path_small') where row = 'testinsert';
  insert into offer(image_big:content,image_big:path,image_small:content,image_small:path) values ('hijklmn','path_big','hij','path_small') where row = 'testinsert';
 
  select * from offer where row = 'testinsert';
 +------------------------+-------------------------+
 | Column                 | Cell                    |
 +------------------------+-------------------------+
 | image_big:             | hijklmn                 |
 +------------------------+-------------------------+
 | image_big:content      | hijklmn                 |
 +------------------------+-------------------------+
 | image_big:path         | path_big                |
 +------------------------+-------------------------+
 | image_small:           | hij                     |
 +------------------------+-------------------------+
 | image_small:content    | hij                     |
 +------------------------+-------------------------+
 | image_small:path       | path_small              |
 +------------------------+-------------------------+
 
  select count(*) from offer where row = 'testinsert';
 1 row(s) in set. (0.02 sec)
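 As shown above, although we ran 4 inserts, the count is 1: they all wrote to the same row, and HBase overwrote the values that shared a column. Insert 2 overwrote insert 1 and insert 4 overwrote insert 3, so an insert acts like an update. The shell documentation also shows that HQL provides no update.
 The stored data should now look like this: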
 +------------+-------------------------------+-----------------------------+
 |            | Column image_big              | Column image_small          |
 | key        +---------+----------+----------+-----+----------+------------+
 |            | :       | :content | :path    | :   | :content | :path      |
 +------------+---------+----------+----------+-----+----------+------------+
 | testinsert | hijklmn | hijklmn  | path_big | hij | hij      | path_small |
 +------------+---------+----------+----------+-----+----------+------------+
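
 The overwrite behaviour seen so far can be imitated with nested maps. This is only a toy model in plain Java that I made up to replay the 4 inserts (row key -> column name -> value); it does not talk to HBase and says nothing about how HBase stores data internally:

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model only: row key -> (column name -> value). Writing to an
// existing row/column replaces the old value, like the inserts above.
class OverwriteModel {
    private final Map<String, Map<String, String>> rows = new TreeMap<>();

    void insert(String row, String[] columns, String[] values) {
        Map<String, String> cells = rows.get(row);
        if (cells == null) {
            cells = new TreeMap<>();
            rows.put(row, cells);
        }
        for (int i = 0; i < columns.length; i++) {
            cells.put(columns[i], values[i]); // same column: old value is replaced
        }
    }

    int rowCount() {
        return rows.size();
    }

    Map<String, String> select(String row) {
        return rows.get(row);
    }

    // Replays the 4 inserts from the shell session above.
    static OverwriteModel replayInserts() {
        OverwriteModel m = new OverwriteModel();
        String[] plain = { "image_big:", "image_small:" };
        String[] detailed = { "image_big:content", "image_big:path",
                              "image_small:content", "image_small:path" };
        m.insert("testinsert", plain, new String[] { "abcdefg", "abc" });
        m.insert("testinsert", plain, new String[] { "hijklmn", "hij" });
        m.insert("testinsert", detailed,
                new String[] { "abcdefg", "path_big", "abc", "path_small" });
        m.insert("testinsert", detailed,
                new String[] { "hijklmn", "path_big", "hij", "path_small" });
        return m;
    }

    public static void main(String[] args) {
        OverwriteModel m = replayInserts();
        System.out.println("rows: " + m.rowCount());
        for (Map.Entry<String, String> e : m.select("testinsert").entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

 In this model the 4 inserts leave one row with 6 columns, which is the same shape as the select and count results above.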
 What happens if the inserts carry a TIMESTAMP?
 delete * from offer where row = 'testinsert';
 
 insert into offer(image_big:,image_small:) values ('abcdefg','abc') where row = 'testinsert' timestamp '1209982310285';
  insert into offer(image_big:,image_small:) values ('hijklmn','hij') where row = 'testinsert' timestamp '1209982311285';
  insert into offer(image_big:content,image_big:path,image_small:content,image_small:path) values ('abcdefg','path_big','abc','path_small') where row = 'testinsert' timestamp '1209982312285';
  insert into offer(image_big:content,image_big:path,image_small:content,image_small:path) values ('hijklmn','path_big','hij','path_small') where row = 'testinsert' timestamp '1209982313285';
 
  Whether the query is
  select * from offer where row = 'testinsert' 
  or select * from offer where row = 'testinsert' timestamp '1209982310285';
  the result is always only
  +-------------------------+----------------------+
  | Column                  | Cell                 |
  +-------------------------+----------------------+
  | image_big:              | hijklmn              |
  +-------------------------+----------------------+
  | image_big:content       | hijklmn              |
  +-------------------------+----------------------+
  | image_big:path          | path_big             |
  +-------------------------+----------------------+
  | image_small:            | hij                  |
  +-------------------------+----------------------+
  | image_small:content     | hij                  |
  +-------------------------+----------------------+
  | image_small:path        | path_small           |
  +-------------------------+----------------------+
 
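  I am confused. The HBase Architecture page describes timestamps, with data versioned by time, so how should this result be understood?
  http://www.mail-archive.com/core-user@hadoop.apache.org/msg00222.html suggests this is apparently not supported yet, but my inserts succeeded. My own understanding is that row and timestamp are both index-level parts of the data structure, outside the data itself, so it is fine that they are not displayed. But the data seems to have been overwritten. Perhaps this is simply not supported yet...
  First delete
  delete * from offer where row = 'testinsert'; 
  then select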
  select * from offer where row = 'testinsert';
  +-------------------------+----------------------+
  | Column                  | Cell                 |
  +-------------------------+----------------------+
  | image_big:              | abcdefg              |
  +-------------------------+----------------------+
  | image_big:content       | abcdefg              |
  +-------------------------+----------------------+
  | image_big:path          | path_big             |
  +-------------------------+----------------------+
  | image_small:            | abc                  |
  +-------------------------+----------------------+
  | image_small:content     | abc                  |
  +-------------------------+----------------------+
  | image_small:path        | path_small           |
  +-------------------------+----------------------+ 
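 This unexpected discovery shows that older versions of the data are kept after all; the historical data just could not be queried. The timestamp condition in select seems to have no effect, and every query returns the latest data. The architecture page says that when an insert carries no timestamp, the system uses the current time by default.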
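
 One way to picture what was observed: each cell keeps several versions keyed by timestamp, a read returns the newest one, and the delete removed only the newest one. The sketch below is my own toy model in plain Java, written to match the shell output above. It is a guess at the behaviour, not HBase's actual implementation:

```java
import java.util.TreeMap;

// Toy model of one cell: timestamp -> value, newest version wins on read.
// Assumption: delete drops only the newest version, as observed above.
class VersionedCell {
    private final TreeMap<Long, String> versions = new TreeMap<>();

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    // Returns the value with the largest timestamp, or null if empty.
    String latest() {
        return versions.isEmpty() ? null : versions.lastEntry().getValue();
    }

    // Removes only the newest version; older versions become visible again.
    void deleteLatest() {
        if (!versions.isEmpty()) {
            versions.remove(versions.lastKey());
        }
    }

    public static void main(String[] args) {
        VersionedCell imageBig = new VersionedCell();   // stands for column image_big:
        imageBig.put(1209982310285L, "abcdefg");
        imageBig.put(1209982311285L, "hijklmn");
        System.out.println("before delete: " + imageBig.latest());
        imageBig.deleteLatest();
        System.out.println("after delete:  " + imageBig.latest());
    }
}
```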
 
5 Accessing HBase from a client
   As with the earlier HDFS client, add hbase-site.xml and the lib jars to the project. The code is as follows
  
  package com.chua.hadoop.client;
  
  import java.io.BufferedInputStream;
  import java.io.BufferedOutputStream;
  import java.io.DataInputStream;
  import java.io.File;
  import java.io.FileInputStream;
  import java.io.FileOutputStream;
  import java.io.IOException;
  import java.util.Iterator;
  import java.util.SortedMap;
  
  import org.apache.commons.httpclient.HttpClient;
  import org.apache.commons.httpclient.methods.GetMethod;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.HTable;
  import org.apache.hadoop.io.Text;
  
  /**
   * Example client: stores two images in the HBase table "offer" and reads one back.
   * @author chua 2008-5-4 17:03:33
   */
  public class HBase {
  
      /**
       * @param args
       */
      public static void main(String[] args) throws Exception {
          String domain = "www.dlog.cn";
          String path_s = "/uploads/m/me/meichua/meichua_100.jpg";
          String path_b = "/uploads/m/me/meichua/200804/22094433_tLuyw.jpg";
          byte[] data_s = getData(domain, path_s);
          byte[] data_b = getData(domain,path_b);
  
          HBaseConfiguration config = new HBaseConfiguration();
          HTable table = new HTable(config, new Text("offer"));
          createRecore(table,"chua","image_big",data_b,path_b);
          createRecore(table,"chua","image_small",data_s,path_s);
         
          // fetch all data of one row and iterate over the keySet (the column names)
          SortedMap map = table.getRow(new Text("chua"));
          if(!map.isEmpty()) {
              Iterator it = map.keySet().iterator();
              while(it.hasNext()){
                  System.out.println(it.next());
              }
          }
          // fetch the data of one named column of a row
          byte[] data = table.get(new Text("chua"), new Text("image_big:content"));
          saveAsFile(data,"c:/chua_big.jpg");
      }
  
      public static void createRecore(HTable table,String row, String colunm,byte[] data, String path) throws IOException {
          long lockId = table.startUpdate(new Text(row));
          table.put(lockId, new Text(colunm+":content"), data);
          table.put(lockId, new Text(colunm+":path"), path.getBytes());
          table.commit(lockId);
      }
     
      /**
       * Download an image over HTTP.
       * @param domain host name of the web server
       * @param path path of the image on that server
       * @return the response body, or null if the download failed
       */
      public static byte[] getData(String domain,String path){
          byte[] dataResource = null;
          try {
              HttpClient client = new HttpClient();
              client.getHostConfiguration().setHost(domain,80,"http");
              GetMethod getMethod = new GetMethod(path);
              try {
                  int status = client.executeMethod(getMethod);
                  if(status == 200) {
                      dataResource = getMethod.getResponseBody();
                  }
              } finally {
                  // always release the connection, even if the request fails
                  getMethod.releaseConnection();
              }
          } catch(Exception e) {
              System.out.println("Download error: "+e);
          }
          return dataResource;
      }
     
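      /**
       * Read the data from a local file.
       * @param path path of the local file
       * @return the file content, or null if reading failed
       */
      public static byte[] getData(String path) {
          File file = new File(path);
          DataInputStream dis = null;
          try {
              dis = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
              // size the buffer from the file length; available() is only an estimate
              byte[] data = new byte[(int) file.length()];
              // readFully keeps reading until the buffer is full
              dis.readFully(data);
              return data;
          } catch (Exception e) {
              e.printStackTrace();
              return null;
          } finally {
              if (dis != null) {
                  try {
                      dis.close();
                  } catch (IOException e) {
                      e.printStackTrace();
                  }
              }
          }
      }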
  
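      /**
       * Save the data to a file.
       * @param data bytes to write; nothing is written if null
       * @param path path of the target file
       */
      public static void saveAsFile(byte[] data,String path) {
          if(data != null) {
              BufferedOutputStream out = null;
              try {
                  out = new BufferedOutputStream(new FileOutputStream(path));
                  // write the whole array in one call
                  out.write(data);
              } catch (Exception e) {
                  e.printStackTrace();
              } finally {
                  // close the stream even if the write failed
                  if (out != null) {
                      try {
                          out.close();
                      } catch (IOException e) {
                          e.printStackTrace();
                      }
                  }
              }
          }
      }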
  }
Output:
image_big:content
image_big:path
image_small:content
image_small:path
The above is a fairly simple example of a client accessing HBase.
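
The column names printed above all have the form family:name, where the family (image_big, image_small) is what was declared in create table and the part after the colon was chosen freely at insert time. A small sketch in plain Java for splitting and grouping such names (the class and its methods are my own illustration, not an HBase API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for column names of the form "family:name".
class ColumnNames {

    // Part before the first colon, e.g. "image_big".
    static String family(String column) {
        int idx = column.indexOf(':');
        return idx < 0 ? column : column.substring(0, idx);
    }

    // Part after the first colon, e.g. "content"; empty for "image_big:".
    static String qualifier(String column) {
        int idx = column.indexOf(':');
        return idx < 0 ? "" : column.substring(idx + 1);
    }

    // Groups column names by family, keeping families in sorted order.
    static Map<String, List<String>> groupByFamily(List<String> columns) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String column : columns) {
            String fam = family(column);
            List<String> names = grouped.get(fam);
            if (names == null) {
                names = new ArrayList<>();
                grouped.put(fam, names);
            }
            names.add(qualifier(column));
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList(
                "image_big:content", "image_big:path",
                "image_small:content", "image_small:path");
        System.out.println(groupByFamily(columns));
    }
}
```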


6 HBase architecture overview

http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture

Tags: hbase hadoop java