mlzboy
Ruby data filtering operations


The Enumerable mixin provides collection classes with several traversal and searching methods, and with the ability to sort. The class must provide a method each, which yields successive members of the collection. If Enumerable#max, min, or sort is used, the objects in the collection must also implement a meaningful <=> operator, as these methods rely on an ordering between members of the collection.
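As a minimal sketch of that contract, the class below gains the Enumerable methods by defining only each. The class name NumberBag and its contents are hypothetical, chosen purely for illustration; the elements are Integers, which already implement <=>, so sort and max work.

```ruby
# Hypothetical example class (not part of Ruby); it only defines each.
class NumberBag
  include Enumerable

  def initialize(*numbers)
    @numbers = numbers
  end

  # Enumerable builds everything else on top of this method,
  # which must yield successive members of the collection.
  def each
    @numbers.each { |n| yield n }
  end
end

bag = NumberBag.new(3, 1, 2)

bag_sorted  = bag.sort                     # relies on Integer#<=>
bag_max     = bag.max                      # relies on Integer#<=>
bag_evens   = bag.select { |n| n.even? }   # needs only each
bag_has_two = bag.include?(2)              # needs only each and ==
```

Here bag_sorted is [1, 2, 3], bag_max is 3, bag_evens is [2], and bag_has_two is true.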

Methods

all?   any?   collect   detect   each_cons   each_slice   each_with_index   entries   enum_cons   enum_slice   enum_with_index   find   find_all   grep   include?   inject   map   max   member?   min   partition   reject   select   sort   sort_by   to_a   to_set   zip

Classes and Modules

Class Enumerable::Enumerator

Public Instance methods

enum.all? [{|obj| block } ] => true or false

Passes each element of the collection to the given block. The method returns true if the block never returns false or nil. If the block is not given, Ruby adds an implicit block of {|obj| obj} (that is, all? will return true only if none of the collection members are false or nil).

   %w{ ant bear cat}.all? {|word| word.length >= 3}   #=> true
   %w{ ant bear cat}.all? {|word| word.length >= 4}   #=> false
   [ nil, true, 99 ].all?                             #=> false

enum.any? [{|obj| block } ] => true or false

Passes each element of the collection to the given block. The method returns true if the block ever returns a value other than false or nil. If the block is not given, Ruby adds an implicit block of {|obj| obj} (that is, any? will return true if at least one of the collection members is not false or nil).

   %w{ ant bear cat}.any? {|word| word.length >= 3}   #=> true
   %w{ ant bear cat}.any? {|word| word.length >= 4}   #=> true
   [ nil, true, 99 ].any?                             #=> true

enum.collect {| obj | block } => array
enum.map {| obj | block } => array

Returns a new array with the results of running block once for every element in enum.

   (1..4).collect {|i| i*i }   #=> [1, 4, 9, 16]
   (1..4).collect { "cat"  }   #=> ["cat", "cat", "cat", "cat"]

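enum.detect(ifnone = nil) {| obj | block } => obj or nil
enum.find(ifnone = nil) {| obj | block } => obj or nil

Passes each entry in enum to block. Returns the first entry for which block is not false. If no object matches, calls ifnone and returns its result when it is specified, or returns nil otherwise.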

   (1..10).detect  {|i| i % 5 == 0 and i % 7 == 0 }   #=> nil
   (1..100).detect {|i| i % 5 == 0 and i % 7 == 0 }   #=> 35
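The ifnone argument is not exercised by the examples above. A small sketch, assuming ifnone is any object that responds to call (a lambda here; the name fallback and the symbol :no_match are hypothetical):

```ruby
# Hypothetical fallback; detect calls it only when no element matches.
fallback = lambda { :no_match }

missing = (1..10).detect(fallback)  { |i| i % 5 == 0 and i % 7 == 0 }
found   = (1..100).detect(fallback) { |i| i % 5 == 0 and i % 7 == 0 }
```

With the fallback in place, missing is :no_match instead of nil, while found is still 35 because the fallback is never called.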

each_cons(n) {...}

Iterates the given block for each array of n consecutive elements.

e.g.:

    (1..10).each_cons(3) {|a| p a}
    # outputs below
    [1, 2, 3]
    [2, 3, 4]
    [3, 4, 5]
    [4, 5, 6]
    [5, 6, 7]
    [6, 7, 8]
    [7, 8, 9]
    [8, 9, 10]

each_slice(n) {...}

Iterates the given block for each slice of n elements.

e.g.:

    (1..10).each_slice(3) {|a| p a}
    # outputs below
    [1, 2, 3]
    [4, 5, 6]
    [7, 8, 9]
    [10]
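The difference between the two methods is that each_cons produces overlapping windows while each_slice produces disjoint batches. A short sketch on the same data (the readings array is hypothetical sample data):

```ruby
readings = [3, 5, 4, 8, 6]   # hypothetical sample data

# each_cons(3) yields overlapping windows: [3, 5, 4], [5, 4, 8], [4, 8, 6]
window_sums = []
readings.each_cons(3) do |window|
  window_sums << window.inject { |sum, n| sum + n }
end

# each_slice(2) yields disjoint batches; the last one may be shorter.
batches = []
readings.each_slice(2) { |batch| batches << batch }
```

This leaves window_sums as [12, 17, 18] and batches as [[3, 5], [4, 8], [6]].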

enum.each_with_index {|obj, i| block } -> enum

Calls block with two arguments, the item and its index, for each item in enum.

   hash = Hash.new
   %w(cat dog wombat).each_with_index {|item, index|
     hash[item] = index
   }
   hash   #=> {"cat"=>0, "wombat"=>2, "dog"=>1}

enum.to_a => array
enum.entries => array

Returns an array containing the items in enum.

   (1..7).to_a                       #=> [1, 2, 3, 4, 5, 6, 7]
   { 'a'=>1, 'b'=>2, 'c'=>3 }.to_a   #=> [["a", 1], ["b", 2], ["c", 3]]

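enum.detect(ifnone = nil) {| obj | block } => obj or nil
enum.find(ifnone = nil) {| obj | block } => obj or nil

Passes each entry in enum to block. Returns the first entry for which block is not false. If no object matches, calls ifnone and returns its result when it is specified, or returns nil otherwise.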

   (1..10).detect  {|i| i % 5 == 0 and i % 7 == 0 }   #=> nil
   (1..100).detect {|i| i % 5 == 0 and i % 7 == 0 }   #=> 35

enum.find_all {| obj | block } => array
enum.select {| obj | block } => array

Returns an array containing all elements of enum for which block is not false (see also Enumerable#reject).

   (1..10).find_all {|i|  i % 3 == 0 }   #=> [3, 6, 9]

enum.grep(pattern) => array
enum.grep(pattern) {| obj | block } => array

Returns an array of every element in enum for which Pattern === element. If the optional block is supplied, each matching element is passed to it, and the block's result is stored in the output array.

   (1..100).grep 38..44   #=> [38, 39, 40, 41, 42, 43, 44]
   c = IO.constants
   c.grep(/SEEK/)         #=> ["SEEK_END", "SEEK_SET", "SEEK_CUR"]
   res = c.grep(/SEEK/) {|v| IO.const_get(v) }
   res                    #=> [2, 0, 1]
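Because grep matches with ===, the pattern can be a class or a range as well as a regular expression. The sketch below uses hypothetical sample arrays instead of IO.constants, whose contents vary by platform and which returns symbols rather than strings on Ruby 1.9 and later.

```ruby
mixed = [1, "two", :three, 4.0, 5, nil]          # hypothetical sample data
words = %w(seek_set tell seek_end rewind)        # hypothetical sample data

integers = mixed.grep(Integer)                   # Class#=== tests is_a?
shouted  = mixed.grep(String) { |s| s.upcase }   # block transforms each match
seeks    = words.grep(/\Aseek/)                  # Regexp#=== tests for a match
teens    = [5, 13, 19, 42].grep(13..19)          # Range#=== tests inclusion
```

The results are [1, 5], ["TWO"], ["seek_set", "seek_end"], and [13, 19] respectively.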

enum.include?(obj) => true or false
enum.member?(obj) => true or false

Returns true if any member of enum equals obj. Equality is tested using ==.

   IO.constants.include? "SEEK_SET"          #=> true
   IO.constants.include? "SEEK_NO_FURTHER"   #=> false

enum.inject(initial) {| memo, obj | block } => obj
enum.inject {| memo, obj | block } => obj

Combines the elements of enum by applying the block to an accumulator value (memo) and each element in turn. At each step, memo is set to the value returned by the block. The first form lets you supply an initial value for memo. The second form uses the first element of the collection as the initial value (and skips that element while iterating).

   # Sum some numbers
   (5..10).inject {|sum, n| sum + n }              #=> 45
   # Multiply some numbers
   (5..10).inject(1) {|product, n| product * n }   #=> 151200

   # find the longest word
   longest = %w{ cat sheep bear }.inject do |memo,word|
      memo.length > word.length ? memo : word
   end
   longest                                         #=> "sheep"

   # find the length of the longest word
   longest = %w{ cat sheep bear }.inject(0) do |memo,word|
      memo >= word.length ? memo : word.length
   end
   longest                                         #=> 5
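To make the role of memo concrete, the following sketch records the (memo, element) pair the block receives at every step, once without and once with an initial value. The names steps and steps_with_initial are introduced here only for illustration.

```ruby
# Without an initial value: memo starts as the first element (5),
# and iteration begins at the second element (6).
steps = []
total = (5..10).inject do |sum, n|
  steps << [sum, n]   # record memo and element before combining them
  sum + n
end

# With an initial value: memo starts as 0 and every element is visited.
steps_with_initial = []
small_total = (5..7).inject(0) do |sum, n|
  steps_with_initial << [sum, n]
  sum + n
end
```

Here steps is [[5, 6], [11, 7], [18, 8], [26, 9], [35, 10]] and total is 45, while steps_with_initial is [[0, 5], [5, 6], [11, 7]] and small_total is 18.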

enum.collect {| obj | block } => array
enum.map {| obj | block } => array

Returns a new array with the results of running block once for every element in enum.

   (1..4).collect {|i| i*i }   #=> [1, 4, 9, 16]
   (1..4).collect { "cat"  }   #=> ["cat", "cat", "cat", "cat"]

enum.max => obj
enum.max {|a,b| block } => obj

Returns the object in enum with the maximum value. The first form assumes all objects implement Comparable; the second uses the block to return a <=> b.

   a = %w(albatross dog horse)
   a.max                                  #=> "horse"
   a.max {|a,b| a.length <=> b.length }   #=> "albatross"

enum.include?(obj) => true or false
enum.member?(obj) => true or false

Returns true if any member of enum equals obj. Equality is tested using ==.

   IO.constants.include? "SEEK_SET"          #=> true
   IO.constants.include? "SEEK_NO_FURTHER"   #=> false

enum.min => obj
enum.min {|a,b| block } => obj

Returns the object in enum with the minimum value. The first form assumes all objects implement Comparable; the second uses the block to return a <=> b.

   a = %w(albatross dog horse)
   a.min                                  #=> "albatross"
   a.min {|a,b| a.length <=> b.length }   #=> "dog"

enum.partition {| obj | block } => [ true_array, false_array ]

Returns two arrays, the first containing the elements of enum for which the block evaluates to true, the second containing the rest.

   (1..6).partition {|i| (i&1).zero?}   #=> [[2, 4, 6], [1, 3, 5]]

enum.reject {| obj | block } => array

Returns an array of all elements of enum for which block is false (see also Enumerable#find_all).

   (1..10).reject {|i|  i % 3 == 0 }   #=> [1, 2, 4, 5, 7, 8, 10]

enum.find_all {| obj | block } => array
enum.select {| obj | block } => array

Returns an array containing all elements of enum for which block is not false (see also Enumerable#reject).

   (1..10).find_all {|i|  i % 3 == 0 }   #=> [3, 6, 9]
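The filtering methods select, reject, and partition are closely related: given the same predicate, partition returns the select result and the reject result as a pair. A brief sketch on a list of records (the animals data and the four_legged predicate are hypothetical):

```ruby
# Hypothetical sample records.
animals = [
  { :name => "ant",    :legs => 6 },
  { :name => "bear",   :legs => 4 },
  { :name => "spider", :legs => 8 },
  { :name => "cat",    :legs => 4 }
]

# Hypothetical predicate, reused for all three calls.
four_legged = lambda { |animal| animal[:legs] == 4 }

kept    = animals.select(&four_legged)      # elements where the block is true
dropped = animals.reject(&four_legged)      # elements where the block is false
both    = animals.partition(&four_legged)   # [kept, dropped] in one pass

kept_names    = kept.map { |animal| animal[:name] }
dropped_names = dropped.map { |animal| animal[:name] }
```

kept_names is ["bear", "cat"], dropped_names is ["ant", "spider"], and both equals [kept, dropped].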

enum.sort => array
enum.sort {| a, b | block } => array

Returns an array containing the items in enum sorted, either according to their own <=> method, or by using the results of the supplied block. The block should return -1, 0, or +1 depending on the comparison between a and b. As of Ruby 1.8, the method Enumerable#sort_by implements a built-in Schwartzian Transform, useful when key computation or comparison is expensive.

   %w(rhea kea flea).sort         #=> ["flea", "kea", "rhea"]
   (1..10).sort {|a,b| b <=> a}   #=> [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

enum.sort_by {| obj | block } => array

Sorts enum using a set of keys generated by mapping the values in enum through the given block.

   %w{ apple pear fig }.sort_by {|word| word.length}
                #=> ["fig", "pear", "apple"]

The current implementation of sort_by generates an array of tuples containing the original collection element and the mapped value. This makes sort_by fairly expensive when the keysets are simple.

   require 'benchmark'
   include Benchmark

   a = (1..100000).map {rand(100000)}

   bm(10) do |b|
     b.report("Sort")    { a.sort }
     b.report("Sort by") { a.sort_by {|x| x} }
   end

produces:

                   user     system      total        real
   Sort        0.180000   0.000000   0.180000 (  0.175469)
   Sort by     1.980000   0.040000   2.020000 (  2.013586)

However, consider the case where comparing the keys is a non-trivial operation. The following code sorts some files on modification time using the basic sort method.

   files = Dir["*"]
   sorted = files.sort {|a,b| File.new(a).mtime <=> File.new(b).mtime}
   sorted   #=> ["mon", "tues", "wed", "thurs"]

This sort is inefficient: it generates two new File objects during every comparison. A slightly better technique is to use the Kernel#test method to generate the modification times directly.

   files = Dir["*"]
   sorted = files.sort { |a,b|
     test(?M, a) <=> test(?M, b)
   }
   sorted   #=> ["mon", "tues", "wed", "thurs"]

This still generates many unnecessary Time objects. A more efficient technique is to cache the sort keys (modification times in this case) before the sort. Perl users often call this approach a Schwartzian Transform, after Randal Schwartz. We construct a temporary array, where each element is an array containing our sort key along with the filename. We sort this array, and then extract the filename from the result.

   sorted = Dir["*"].collect { |f|
      [test(?M, f), f]
   }.sort.collect { |f| f[1] }
   sorted   #=> ["mon", "tues", "wed", "thurs"]

This is exactly what sort_by does internally.

   sorted = Dir["*"].sort_by {|f| test(?M, f)}
   sorted   #=> ["mon", "tues", "wed", "thurs"]
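The saving can be observed without touching the file system by counting how often the key is computed. In this sketch the word list and the expensive_key lambda are hypothetical stand-ins for the file names and the modification-time lookup; the words have distinct lengths so that all three approaches must agree on the order.

```ruby
words = %w(pear fig banana apple kiwifruit)   # hypothetical sample data

key_calls = 0
# Hypothetical stand-in for a costly key such as a file's mtime.
expensive_key = lambda { |w| key_calls += 1; w.length }

# Plain sort: the key is recomputed for both operands of every comparison.
by_sort = words.sort { |a, b| expensive_key.call(a) <=> expensive_key.call(b) }
calls_with_sort = key_calls

# sort_by: the key is computed once per element.
key_calls = 0
by_sort_by = words.sort_by { |w| expensive_key.call(w) }
calls_with_sort_by = key_calls

# The same idea written out by hand: decorate, sort, undecorate.
by_hand = words.map { |w| [w.length, w] }.sort.map { |pair| pair[1] }
```

All three results are ["fig", "pear", "apple", "banana", "kiwifruit"]. calls_with_sort_by equals the number of elements (5), whereas calls_with_sort is twice the number of comparisons the sort performed and is therefore larger.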

enum.to_a => array
enum.entries => array

Returns an array containing the items in enum.

   (1..7).to_a                       #=> [1, 2, 3, 4, 5, 6, 7]
   { 'a'=>1, 'b'=>2, 'c'=>3 }.to_a   #=> [["a", 1], ["b", 2], ["c", 3]]

to_set(klass = Set, *args, &block)

Makes a set from the enumerable object with the given arguments. You must require "set" before using this method.
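A minimal usage sketch (the letters data is hypothetical). The explicit require matches the note above; on newer Ruby versions where Set is already available it is harmless.

```ruby
require 'set'

# Duplicates collapse because a Set holds each value at most once.
letters = %w(a b a c b).to_set

letter_count = letters.size
has_a        = letters.include?("a")
```

Here letters equals Set["a", "b", "c"], letter_count is 3, and has_a is true.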

enum.zip(arg, ...) => array
enum.zip(arg, ...) {|arr| block } => nil

Converts any arguments to arrays, then merges elements of enum with corresponding elements from each argument. This generates a sequence of enum#size n-element arrays, where n is one more than the count of arguments. If the size of any argument is less than enum#size, nil values are supplied. If a block is given, it is invoked for each output array, otherwise an array of arrays is returned.

   a = [ 4, 5, 6 ]
   b = [ 7, 8, 9 ]

   (1..3).zip(a, b)      #=> [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
   "cat\ndog".zip([1])   #=> [["cat\n", 1], ["dog", nil]]
   (1..3).zip            #=> [[1], [2], [3]]
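The nil padding and the block form can be sketched with plain arrays (names and scores are hypothetical sample data). Arrays are used here because the "cat\ndog".zip example depends on String being Enumerable, which is no longer the case from Ruby 1.9 onward.

```ruby
names  = %w(ant bear cat)   # hypothetical sample data
scores = [10, 20]           # shorter than the receiver, so zip pads with nil

# Without a block: one row per element of the receiver.
rows = (1..3).zip(names, scores)

# With a block: each row is passed to the block and zip itself returns nil.
lines = []
block_result = (1..3).zip(names, scores) do |rank, name, score|
  lines << "#{rank}. #{name}: #{score.inspect}"
end
```

rows is [[1, "ant", 10], [2, "bear", 20], [3, "cat", nil]], lines is ["1. ant: 10", "2. bear: 20", "3. cat: nil"], and block_result is nil.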