Learning Perl: 6.1. What Is a Hash?

6.1. What Is a Hash?

A hash is a data structure like an array, in that it can hold any number of values and retrieve these values at will. However, instead of indexing the values by number, as we did with arrays, we'll look up the values by name. That is, the indices (here, we'll call them keys) aren't numbers but are arbitrary unique strings (see Figure 6-1).

The keys are strings, first of all, so instead of getting element number 3 from an array, we'll be accessing the hash element named wilma.

These keys are arbitrary strings; you can use any string expression for a hash key. And they are unique strings: just as there's only one array element numbered 3, there's only one hash element named wilma.
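As a minimal sketch of looking up a value by name rather than by number, consider the following. The hash name %family_name and its contents are assumptions made up for illustration; only the key wilma comes from the text.

```perl
use strict;
use warnings;

# Hypothetical data: given names (keys) mapped to family names (values).
my %family_name = (
    fred  => 'flintstone',
    wilma => 'flintstone',
);

# Look up an element by name instead of by number.
print "wilma => $family_name{'wilma'}\n";

# Any string expression can serve as the key.
my $key = 'fr' . 'ed';
print "$key => $family_name{$key}\n";
```

The second lookup builds its key at runtime, which is what "any string expression" means in practice.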

Another way to think of a hash is that it's like a barrel of data (see Figure 6-2), where each piece of data has a tag attached. You can reach into the barrel and pull out any tag and see what piece of data is attached. But there's no "first" item in the barrel; it's just a jumble. In an array, we'd start with element 0 and then element 1, element 2, and so on. But in a hash, there's no fixed order, no first element. It's just a collection of key/value pairs.

 

Figure 6-1. Hash keys and values

Figure 6-2. A hash as a barrel of data


The keys and values are both arbitrary scalars, but the keys are always converted to strings. So, if you used the numeric expression 50/20 as the key,[*] it would be turned into the three-character string "2.5", which is one of the keys shown in the diagram above.

[*] That's a numeric expression, not the five-character string "50/20". If we used that five-character string as a hash key, it would stay the same five-character string.
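The difference between the numeric expression and the quoted string can be checked with a short sketch. The hash name %hash and the stored values are placeholders chosen for this example.

```perl
use strict;
use warnings;

my %hash;

# The numeric expression is evaluated first (giving 2.5),
# and the result is then converted to the string "2.5".
$hash{ 50/20 } = 'from the numeric expression';

# The quoted string is used exactly as written: five characters.
$hash{'50/20'} = 'from the literal string';

# Sort the keys so the report order is predictable.
for my $key (sort keys %hash) {
    printf "key '%s' has %d characters\n", $key, length $key;
}
```

This should report two separate keys, '2.5' with 3 characters and '50/20' with 5 characters.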

As usual, Perl's no-unnecessary-limits philosophy applies: a hash may be of any size, from an empty hash with zero key/value pairs, up to whatever fills up your memory.

Some implementations of hashes (such as in the original awk language, from where Larry borrowed the idea) slow down as the hashes get larger. This is not the case in Perl; it has a good, efficient, scalable algorithm.[*] So, if a hash has only three key/value pairs, it's quick to "reach into the barrel" and pull out any one of those. If the hash has 3 million key/value pairs, it should be about as quick to pull out any one of those. A big hash is nothing to fear.

[*] Technically, Perl rebuilds the hash table as needed for larger hashes. The term "hashes" comes from the fact that a hash table is used for implementing them.

Keys are unique, though the values can be duplicated. The values of a hash may be all numbers, strings, undef values, or a mixture,[†] but the keys are arbitrary, unique strings.

[†] Or, in fact, any scalar values, including scalar types other than the ones we'll see in this book.
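A small sketch of unique keys with mixed and duplicated values follows. The hash name %value_for and all of its entries are invented for illustration.

```perl
use strict;
use warnings;

# Values may repeat and may mix strings, numbers, and undef.
my %value_for = (
    fred   => 'flintstone',
    wilma  => 'flintstone',   # same value filed under a different key
    barney => 42,
    dino   => undef,
);

# Keys are unique, so storing under an existing key
# replaces the old value instead of adding a second "fred".
$value_for{fred} = 'astaire';

my $pairs = keys %value_for;   # in scalar context, keys gives the count
print "$pairs key/value pairs\n";
print "fred is now $value_for{fred}\n";
```

The count stays at four after the assignment, because the assignment overwrote an existing element.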

6.1.1. Why Use a Hash?

When you first hear about hashes, especially if you've lived a long and productive life as a programmer using languages that don't have hashes, you may wonder why anyone would want one of these strange beasts. Well, the general idea is that you'll have one set of data "related to" another set of data. For example, here are some hashes you might find in typical applications of Perl:


 

Given name, family name

The given name (first name) is the key, and the family name is the value. This requires unique given names; if two people were named randal, this wouldn't work. With this hash, you can look up anyone's given name, and find the corresponding family name. If you use the key tom, you get the value phoenix.

Hostname, IP address

You may know that each computer on the Internet has a hostname (such as www.stonehenge.com) and an IP address number (such as 123.45.67.89) because machines like working with the numbers, but we humans have an easier time remembering the names. The hostnames are unique strings, so they can be used to make this hash. With this hash, you could look up a hostname and find the corresponding IP address.

IP address, hostname

Or you could go in the opposite direction. We generally think of the IP address as a number, but it's also a unique string, so it's suitable for use as a hash key. In this hash, we can use the IP address to look up the corresponding hostname. This is not the same hash as the previous example: hashes are a one-way street, running from key to value; there's no way to look up a value in a hash and find the corresponding key. So, these two are a pair of hashes, one for storing IP addresses, one for hostnames. It's easy enough to create one of these given the other, though, as we'll see below.

Word, count of number of times that word appears

This is a common use of a hash. It's so common that it just might turn up in the exercises at the end of the chapter.

The idea here is that you want to know how often each word appears in a given document. Perhaps you're building an index to a number of documents, so when a user searches for fred, you'll know that a certain document mentions fred five times, another mentions fred seven times, and another doesn't mention fred at all. The index shows which documents the user is likely to want. As the index-making program reads through a given document, each time it sees a mention of fred, it adds one to the value filed under the key of fred. That is, if we had seen fred twice already in this document, the value would be 2, but now we'll increment it to 3. If we had never seen fred before, we'd change the value from undef (the implicit, default value) to 1.

Username, number of disk blocks they are using [wasting]

System administrators like this one. The usernames on a given system are unique strings, so they can be used as keys in a hash to look up information about that user.

Driver's license number, name

There may be many people named John Smith, but we hope each one has a different driver's license number. That number makes for a unique key, and the person's name is the value.
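Two of the hashes described in the list above can be sketched in a few lines. The variable names, the sample text, and the second hostname and address are assumptions for illustration only. The inversion uses Perl's built-in reverse in list context, which is one way to build the second hash from the first; it gives a faithful result only when the values are unique.

```perl
use strict;
use warnings;

# Word => count: each sighting adds one to the value filed under that word.
# The first increment turns the implicit undef into 1.
my $document = 'fred barney fred wilma fred';   # hypothetical sample text
my %count;
$count{$_}++ for split ' ', $document;

for my $word (sort keys %count) {
    print "$word appears $count{$word} time(s)\n";
}

# Hostname => IP address (sample data; the second pair is made up).
my %ip_address = (
    'www.stonehenge.com' => '123.45.67.89',
    'www.example.com'    => '192.0.2.1',
);

# IP address => hostname: reverse swaps each key/value pair.
# If two hosts shared an address, one of them would be lost here.
my %host_name = reverse %ip_address;

print "123.45.67.89 is $host_name{'123.45.67.89'}\n";
```

Keeping both %ip_address and %host_name around reflects the point that a hash runs one way only, from key to value.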

Another way to think of a hash is as a simple database, in which one piece of data may be filed under each key. If your task description includes phrases like "finding duplicates," "unique," "cross-reference," or "lookup table," it's likely that a hash will be useful in the implementation.
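For instance, a "finding duplicates" or "unique" task might be approached as in the sketch below. The %seen hash and the sample list are illustrative names and data, not anything defined in this section.

```perl
use strict;
use warnings;

my @names = qw(fred barney fred wilma barney fred);   # hypothetical input

# %seen records how many times each name has turned up so far.
my %seen;

# Keep a name only the first time it appears (its count was still 0).
my @unique = grep { !$seen{$_}++ } @names;

# Any name counted more than once is a duplicate.
my @duplicates = sort grep { $seen{$_} > 1 } keys %seen;

print "unique: @unique\n";
print "duplicates: @duplicates\n";
```

Because keys are unique, each name gets exactly one counter, which is what makes the hash a natural fit for this kind of task.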
