



The "Wonderful" Life of a SAN Admin


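Posted 2010-11-16 23:52 by JuJu
This post was written or reposted by JuJu and does not represent the position or views of this site. Copyright belongs to oursteps.com.au and the author JuJu. Reposts must credit the author, the source, and this notice, and must keep the content complete.
I was reading this guy's blog today and cracked up. Why does it all sound so familiar? You could copy a passage like this straight into your resume.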

http://storagebod.typepad.com/st ... n-benchmarking.html

Assault on Benchmarking
I've been thinking a bit about benchmarking and benchmarketing; pretty much everyone agrees that SPC is a very poor representation of real-world storage performance, but at the moment it's the only thing that most of the market supports, with one key exception. So I thought I'd come up with my own; let me introduce the SSAC.

Storagebod's Storage Assault Course

This is a multi-part benchmark and is supposed to reflect the real-world life of a storage array. You may bid what you like, but all costs, including the costs of the team who support and run the benchmark, must be declared.

1) You must specify an array to run the SPC benchmarking suite; this array must be ordered through the normal ordering process.

2) A short period of time prior to the delivery of the array, you will be informed that the workload for the array has been changed to another random workload. However, the agreed delivery date must still be met, so any changes to the configuration must be made without impact to the delivery.

3) You must carry out an audit of the existing SAN environment and ensure that the installation of your array will not cause any impact to the already running environment. In the course of the audit, you will discover that pretty much the whole of the environment is down-level and not currently certified against your new array. You must agree any outage required to upgrade the new array; this may involve you auditing every single server and switch to ensure that key variables such as time-outs on multipathing are set correctly.

The day before installation, you will discover a dozen servers which you were not aware of and at least three of these will be running operating systems which are so far out of support that no-one is sure what is going to happen.

You will be responsible for raising changes, carrying out risk assessments and arranging site surveys, traffic plans and any other supporting tasks.

4) On the day of delivery, you will discover that the array is to be installed in another part of the data centre which has no power, and where the array will probably fall through the floor, crushing the secretarial team below.

You will be responsible for arranging the remedial works and rescheduling all the changes required.

5) Once the array is installed, you will be responsible for powering it up and installing the initial configuration.

6) A Major Production Incident is declared and you will be responsible for convincing everyone that it is not your new array which caused the problem.

7) You are finally allowed to run your first benchmark.

8) You discover that all of the SAN switches have been set to run at the wrong speed and you need to raise changes to correct this.

9) You are allowed to re-run your benchmark again but halfway through your benchmark a Performance Major Production Incident is declared and you are responsible for first proving that it is not your array which caused the problem and then fixing the Incident which is nothing to do with the work you are carrying out.

10) You manage to successfully run your benchmark.

11) You receive an urgent change request and you must find space on your array for a new workload; this workload has completely different performance requirements but you will be told to JFDI.

12) You run your second benchmark.

13) You receive an urgent call from the original application team; apparently they mis-stated their requirement and they are servicing twice as many requests as originally thought. You must expand the environment as an urgent priority.

14) Through the normal ordering channels, you must arrange any necessary expansion to the current environment.

15) You must carry out any necessary reconfiguration without impact to the running workloads or agree any downtime with the business to allow you to carry out the necessary reconfiguration.

16) You will receive an emergency technical notice from your vendor support teams; a fix must be implemented immediately or the array will catch fire (possibly).

17) You must re-audit the environment and carry out any remedial work on servers, switches and applications to support the new code level.

18) You will be informed that the system must now have a DR capability due to government regulation.

19) Carry out any design and certification work to provide DR capability. The order for the DR capability will be held up in procurement but you still need to implement within the government timescales or face severe financial penalties.

20) Re-benchmark the entire environment showing the impact of replication on the environment.

Explain to a non-technical person that the speed of light in fibre differs from that in a vacuum. Explain in a louder voice that no, the speed of light is not something which can be overcome and no, rival company X has not overcome it, despite what they might say.

21) A third workload will be detailed; this must be shoe-horned onto the array at no extra cost but with no impact to the additional applications.

22) Benchmark again to prove this case.

23) You are summoned to a meeting to explain and justify your array to a Senior Manager who has just been bought a three star lunch by a rival vendor.

24) You will experience a failure on your array; record how many logs and diagnostics the vendor support team (and no cheating, you must use the customer route) ask you for before agreeing to replace the failed disk.

25) You will be informed that Group Audit have looked at the application that you provided DR for and decided that actually it was exempt, hence you have wasted time and money.

26) Plan to repurpose DR array for a new workload requirement.

27) A new capability is announced and will require a major firmware upgrade that may require downtime but your support team are not sure. Carry out the risk analysis and plan for upgrades.

28) You are outsourced!

At this stage please disclose:

1) Total Staff costs including time-off for stress related illness

2) Total Capital Expenditure

3) Total Downtime due to upgrades and other non-disruptive activities

4) Any other expenditure due to work required to existing SAN environment and Server infrastructure

And those performance benchmarks, throw them away; you didn't think I was actually interested in them!?
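The speed-of-light point in the DR steps can be put into rough numbers. The sketch below is only an illustration: the refractive index of 1.47 is an assumed typical value for silica fibre, the function names are made up for this example, and real replication links add switch, protocol and array latency on top of pure propagation delay.

```python
# Rough propagation delay over optical fibre, for sizing replication distance.
C_VACUUM_KM_S = 299_792.458   # speed of light in a vacuum, km/s
REFRACTIVE_INDEX = 1.47       # ASSUMPTION: typical value for silica fibre


def fibre_latency_ms(distance_km: float,
                     refractive_index: float = REFRACTIVE_INDEX) -> float:
    """One-way propagation delay in milliseconds over the given fibre length."""
    speed_km_s = C_VACUUM_KM_S / refractive_index
    return distance_km / speed_km_s * 1000.0


def sync_write_penalty_ms(distance_km: float) -> float:
    """Minimum extra latency per synchronously replicated write (one round trip)."""
    return 2 * fibre_latency_ms(distance_km)


if __name__ == "__main__":
    for km in (10, 100, 1000):
        print(f"{km:>5} km: one-way {fibre_latency_ms(km):.3f} ms, "
              f"round trip {sync_write_penalty_ms(km):.3f} ms")
```

Light in fibre travels at roughly two thirds of its vacuum speed, so every synchronous write pays at least one round trip to the remote site. No vendor can engineer that away.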





Posted 2010-11-17 10:46 by xyan1

Posted 2010-11-17 13:37 by koyuu
Vendor support matters a lot: if you go down, take someone down with you, haha. But be careful, plan well, and do everything by the documentation. Today's SANs are quite robust and problems are unlikely, though it also comes down to luck.


Posted 2010-11-18 04:11 by daffodil

Posted 2010-11-18 16:37 by gnorud
I'll forward this to my team in a bit. lol


Posted 2010-11-18 19:20 by JuJu
Originally posted by daffodil on 2010-11-18 04:11

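A SAN is simple enough to describe. It usually means a Fibre Channel storage network kept separate from the Ethernet network everyone already has. In the early days, when an enterprise server needed disk storage and its internal storage was not enough, it generally used SCSI DAS, attached directly to the server with a SCSI cable: either an internal RAID controller with an external JBOD, or the slightly pricier kind where the RAID controller is external too. SCSI DAS did not scale well, so it later evolved into a Fibre Channel storage network that uses the SCSI protocol to move data, with the storage network replacing the SCSI cable. Many servers can connect to the storage network through HBAs and share the storage arrays inside it. Devices in an FC storage network, such as arrays and switches, all have Fibre Channel interfaces, and hosts need HBA cards (Emulex and QLogic cards being the most common).

Compared with traditional SCSI DAS it has many advantages: better performance, easier expansion, and access to high-availability techniques such as local data copy, remote data copy, multipathing, synchronous and asynchronous replication, three-site setups and so on, all to maximize data availability and business continuity. That is why critical applications at places like banks generally use it.

Storage array vendors include HDS, EMC, IBM, HP, NetApp and others. The main SAN switch vendors are Cisco, Brocade and McDATA (since acquired by Brocade). Historically Brocade and McDATA dominated, holding essentially the whole fibre switch market, but in recent years Cisco's fibre switches have been growing fast and more and more companies use them. A SAN infrastructure generally also includes tape libraries and the like.

Compared with NAS, another technology often mentioned alongside it, a SAN, whether a Fibre Channel SAN or an iSCSI SAN, moves data at block level. That is completely different from NAS, which uses protocols such as CIFS/NFS and moves data at file level. SAN performance is generally much better than NAS.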
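To make the block-level versus file-level distinction concrete, here is a small sketch. It is not real SAN or NAS code: an ordinary local file stands in for a LUN, a local directory stands in for a NAS share, and the function names and the 512-byte block size are assumptions chosen for illustration.

```python
import os
import tempfile

BLOCK_SIZE = 512  # ASSUMPTION: a common sector size, used here for illustration


def read_block(device_path: str, lba: int, block_size: int = BLOCK_SIZE) -> bytes:
    """Block-level access: address data by logical block number, no file names."""
    with open(device_path, "rb") as dev:
        dev.seek(lba * block_size)
        return dev.read(block_size)


def read_file(share_dir: str, name: str) -> bytes:
    """File-level access: ask for a named file; the server works out the blocks."""
    with open(os.path.join(share_dir, name), "rb") as f:
        return f.read()


def make_fake_lun(path: str, blocks: int) -> None:
    """Create a stand-in 'LUN' where block i is filled with the byte value i."""
    with open(path, "wb") as dev:
        for i in range(blocks):
            dev.write(bytes([i % 256]) * BLOCK_SIZE)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        lun = os.path.join(tmp, "fake_lun.img")
        make_fake_lun(lun, 8)
        print(read_block(lun, 3)[:4])          # first bytes of block 3

        with open(os.path.join(tmp, "report.txt"), "wb") as f:
            f.write(b"hello")
        print(read_file(tmp, "report.txt"))    # whole file fetched by name
```

With block access the host's own file system decides what the blocks mean. With file access that job sits on the storage side.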

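But SAN is expensive. High-end array solutions easily run into the millions, HBAs and fibre switches are not cheap either, DR makes it more expensive still, and with local copy and remote copy the cost multiplies. If a company has no SAN yet, all the hardware is new investment, which is why iSCSI has become hot in recent years: iSCSI can use the existing Ethernet network (it is also called storage over IP), so there is no need to build a separate SAN network. Nearly all of these storage vendors have released arrays with iSCSI interfaces, at first mostly low-end products and now high-end ones too. iSCSI currently runs at a basic 1 Gbps, still a good deal slower than the most common 4 Gbps Fibre Channel fabric. The upside is that only one Ethernet network has to be maintained, and as 10 GbE spreads, iSCSI may well become more popular than FC.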
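For a feel of what the 1 Gbps versus 4 Gbps gap means, here is a back-of-the-envelope sketch. It uses nominal line rates only and ignores protocol and encoding overhead, so real transfers will be slower. The 500 GB volume size and the function name are hypothetical.

```python
def transfer_seconds(size_gb: float, link_gbps: float) -> float:
    """Ideal time to move size_gb gigabytes over a link of link_gbps gigabits/s.

    Ignores protocol and encoding overhead, so this is a lower bound.
    """
    size_gigabits = size_gb * 8
    return size_gigabits / link_gbps


if __name__ == "__main__":
    size_gb = 500  # hypothetical volume size
    links = (("iSCSI over 1 GbE", 1), ("4 Gbps FC", 4), ("iSCSI over 10 GbE", 10))
    for name, gbps in links:
        minutes = transfer_seconds(size_gb, gbps) / 60
        print(f"{name:<18} {minutes:6.1f} min")
```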

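Another thing most customer companies are planning for is FCoE, and the major storage vendors are rolling it out too. Once 10 GbE becomes mainstream and FCoE products mature, large companies will probably start deploying it for real.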







Posted 2010-11-18 19:22 by 粉猪妈妈


Posted 2010-11-18 19:24 by JuJu


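Posted 2010-11-18 20:41 by daffodil
I only skimmed it, but JuJu, you really explained it clearly. Our data files all live on a SAN with RAID, and we regularly fill out CRs to add SAN space. I had no notion of a network before; I thought it was just a disk and couldn't figure out how the I/O was controlled. I had always wanted to visit the data center to see what the structure actually looks like. Now that I have a rough picture, I can skip the trip.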
Faith Hope Love


Posted 2010-11-18 20:55 by daffodil
Originally posted by JuJu on 2010-11-18 19:20
But SAN is expensive. High-end array solutions easily run into the millions, HBAs and fibre switches are not cheap either, DR makes it more expensive still, and with local copy and remote copy the cost multiplies.

I feel the same. Every time we file a CR, the boss goes on a tirade about how we haven't organised our data well and take up so much space. PROD, DR, LOGICAL STANDBY, UAT: every addition, even just 20 GB, has to be multiplied by 4.
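The "times 4" arithmetic is simple enough to sketch. The environment names come from the post; the function name is made up for illustration, and the sketch assumes each environment holds a full copy with no RAID or snapshot overhead counted.

```python
# Environments that each hold a full copy of the data, as listed in the post.
ENVIRONMENTS = ("PROD", "DR", "LOGICAL STANDBY", "UAT")


def total_request_gb(data_gb: float, environments=ENVIRONMENTS) -> float:
    """Raw capacity to request when every environment needs its own copy.

    ASSUMPTION: one full copy per environment, no RAID or snapshot overhead.
    """
    return data_gb * len(environments)


if __name__ == "__main__":
    new_data_gb = 20
    print(f"{new_data_gb} GB of new data needs a CR for "
          f"{total_request_gb(new_data_gb)} GB across {len(ENVIRONMENTS)} environments")
```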


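Posted 2010-11-18 21:42 by JuJu
Originally posted by daffodil on 2010-11-18 20:41
I only skimmed it, but JuJu, you really explained it clearly. Our data files all live on a SAN with RAID, and we regularly fill out CRs to add SAN space. I had no notion of a network before; I thought it was just a disk and couldn't figure out how the I/O was controlled. I had always wanted to visit the data center to see what exactly ...

So you mean the structure of the array itself. That is far more intelligent than the JBOD (Just a Bunch Of Disks) of traditional SCSI DAS: inside there are lots of processors and cache to handle I/O. But you wouldn't see much by going to the data center; mostly you'd just see a few racks filled layer upon layer.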






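Posted 2010-11-18 21:46 by JuJu
Originally posted by daffodil on 2010-11-18 20:55

I feel the same. Every time we file a CR, the boss goes on a tirade about how we haven't organised our data well and take up so much space. PROD, DR, LOGICAL STANDBY, UAT: every addition, even just 20 GB, has to be multiplied by 4.

That's why I say the opening post left out quite a few of the joys a SAN admin gets to experience in their "happy" life. You could add: "Waste a few hours explaining to your boss's boss why we bought such expensive kit, and why we don't just buy a pile of 1 TB external hard drives to solve the problem, since they're so cheap now!!"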

Posted 2010-11-18 21:59 by sera_aus
Originally posted by JuJu on 2010-11-18 19:20

A SAN is simple enough to describe. It usually means a Fibre Channel storage network kept separate from the Ethernet network everyone already has. In the early days, when an enterprise server needed disk storage and its internal storage was not enough, it generally used SCSI DA ...



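Posted 2010-11-18 22:12 by daffodil
Originally posted by JuJu on 2010-11-18 21:42

So you mean the structure of the array itself. That is far more intelligent than the JBOD (Just a Bunch Of Disks) of traditional SCSI DAS: inside there are lots of processors and cache to handle I/O. But you wouldn't see much by going to the data center; mostly you'd just see a few racks ...

It's not only the structure of the array itself. I used to think a SAN was just hardware for storage, and I didn't quite understand how it was controlled or how HA was achieved. Now that you say it's a network, those questions are answered.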

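Posted 2010-11-18 22:16 by daffodil
Originally posted by JuJu on 2010-11-18 21:46

That's why I say the opening post left out quite a few of the joys a SAN admin gets to experience in their "happy" life. You could add: "Waste a few hours explaining to your boss's boss why we bought such expensive kit, and why we don't just buy a pile of 1 TB external hard drives to solve the problem, since they're so cheap now!!" ...

I used to work at a telecom billing company. A Pacific island nation actually bought our billing system but couldn't bear to pay for disks. One month the bills couldn't be generated because they were 2 GB short, and our BA fearlessly said: let's just mail them my USB stick.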





Posted 2010-11-18 22:24 by koyuu
Premium storage training, worth 100 taels of silver! SAN101. But the file is too big to upload.

A SAN is simply a network: storage area network ....

By 20G you mean extending the file system, right?






Posted 2010-11-18 22:35 by xyan1

Posted 2010-11-18 22:49 by koyuu
Splitting it into volumes is too much hassle: at under 1 MB apiece it would take 10 parts. If you want it, leave an email address and I'll send it to everyone in one go. It's a very basic SAN training document, so experts can skip it...




Posted 2010-11-18 23:11 by LoveAu
JuJu seems to know everything. I learned something again. Thanks.





Posted 2010-11-18 23:13 by 月亮




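Posted 2010-11-18 23:30 by 西门吹哨
Originally posted by JuJu on 2010-11-18 21:46

That's why I say the opening post left out quite a few of the joys a SAN admin gets to experience in their "happy" life. You could add: "Waste a few hours explaining to your boss's boss why we bought such expensive kit, and why we don't just buy a pile of 1 TB external hard drives to solve the problem, since they're so cheap now!!" ...

Take it up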

Posted 2010-11-19 13:19 by koyuu
What's valuable is the data, not that pile of hardware. Data without protection is like a treasury without a safe. The hardware is expensive because it protects the data from loss as far as possible....


Posted 2010-11-19 17:04 by JuJu
Originally posted by 西门吹哨 on 2010-11-18 23:30





