How YouPorn Uses Redis: SFW Edition

2023-10-30

I interviewed Eric Pickup, IT Lead at the Manwin group (the company behind sites like YouPorn and Pornhub), about their transition to Redis, why they made the switch, and how well it’s worked. Check out Eric’s presentation on Building a Website to Scale, or get started with your own free Redis instance.

Justin:   Can you talk about when and why you guys made the transition to Redis?

Eric:   Basically, about two years ago we acquired the site [YouPorn]. It was written in Perl at the time, which was one of the reasons I was brought on board. Although I had a history of working with Perl, we quickly decided it was just not feasible to maintain. There just aren’t enough developers around, especially strong senior developers. So, if we were to keep it in Perl it was going to be a pretty stagnant site, which is something we obviously didn’t want to do.

Right away, the decision was made to rewrite it and we started looking at different technologies. Our first instinct was actually PHP, but we didn’t want to limit ourselves so we also looked at Java based solutions. After a bunch of research and looking at what technologies we’ve been experimenting with internally, we decided to stick with PHP.

Previously we’d been experimenting with Redis, Varnish and a few other technologies. Some sites internally had already started to use Redis, mostly as a caching solution, but we wanted to see if we could use it as a real data store.

We did some early tests and based our decision mainly on performance, since that was (and is) a huge issue for us. We were very, very impressed with Redis’ general performance, and after some discussion we decided we were going to use Redis as the primary database for the website.

Previously the site had been written in a traditional LAMP stack. It had Linux, Perl, MySQL and Memcached. There were obviously some concerns about the transition. One of the tradeoffs, which I’m actually really glad we did in hindsight, was we kept MySQL in the picture. We don’t read from MySQL on the website, but we are able to use it to do things like populate new lists or hashes, as well as things we couldn’t anticipate ahead of time. We have MySQL more for ad hoc queries, and use Redis for the website.
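
As a rough illustration of that split (not the site's actual code, which is PHP), here is a minimal Python sketch of building a new Redis list from an ad hoc SQL query. Everything named here is an assumption: the `videos` table, its columns, the `build_sorted_set` helper, and the `videos:most_viewed` key are hypothetical, and `sqlite3` stands in for MySQL so the sketch runs without a server.

```python
import sqlite3


def build_sorted_set(db, query):
    """Run an ad hoc SQL query that returns (member, score) rows and
    return a {member: score} mapping, the shape redis-py's zadd() accepts."""
    return {str(member): float(score) for member, score in db.execute(query)}


# sqlite3 is only a stand-in for MySQL; the table layout is hypothetical.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE videos (id INTEGER PRIMARY KEY, title TEXT, rating REAL, views INTEGER)"
)
db.executemany(
    "INSERT INTO videos (id, title, rating, views) VALUES (?, ?, ?, ?)",
    [(1, "first", 4.5, 900), (2, "second", 3.9, 5000), (3, "third", 4.8, 120)],
)

# A list nobody planned for at design time: videos ranked by view count.
most_viewed = build_sorted_set(db, "SELECT id, views FROM videos")

# With a live server the mapping could be loaded in one call, for example:
#   redis.Redis().zadd("videos:most_viewed", most_viewed)
print(sorted(most_viewed, key=most_viewed.get, reverse=True))
```

The relational store stays the place where arbitrary questions can be asked, and each answer is written into Redis in the shape the website reads.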

Soon after we started developing with it, we felt we’d made the right decision. For the first month or so we were prepared to reexamine the choice, but we became comfortable pretty quickly. It was really a good fit for our use case.

Justin:   Why is that, and what were you looking at in terms of evaluating whether or not it was a good decision?

Eric:   Obviously ease of development is a huge one, especially when you are rewriting an entire project like this. Luckily, Redis’ data structures mapped well to what we were doing.

YouPorn at the end of the day is mostly about lists of videos and lists of objects, whether it be comments, favorites, the top rated videos, or the most viewed videos. It’s all lists and then objects, which obviously map well to hashes. We do use some of the other data types, but I’d say that about 90% of our usage falls into the case of either sorted sets or hashes.
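
To make the "sorted sets plus hashes" pattern concrete, here is a minimal Python sketch. It is an illustration under assumptions, not the site's code: the key names (`video:<id>`, `videos:top_rated`, `videos:most_viewed`) and the helpers `save_video` and `list_videos` are hypothetical. `FakeRedis` is an in-memory stand-in that mimics the argument shapes of four redis-py calls (`hset`, `hgetall`, `zadd`, `zrevrange`) so the example runs without a server.

```python
class FakeRedis:
    """In-memory stand-in for the four redis-py calls used below.
    Not a real client: no networking, no expiry, no tie-breaking rules."""

    def __init__(self):
        self.hashes = {}
        self.zsets = {}

    def hset(self, name, mapping):
        self.hashes.setdefault(name, {}).update(mapping)

    def hgetall(self, name):
        return dict(self.hashes.get(name, {}))

    def zadd(self, name, mapping):
        self.zsets.setdefault(name, {}).update(mapping)

    def zrevrange(self, name, start, end):
        # Highest score first; `end` is inclusive, as in Redis.
        # Only non-negative indexes (or end == -1) are handled here.
        ranked = sorted(self.zsets.get(name, {}).items(), key=lambda kv: kv[1], reverse=True)
        members = [member for member, _ in ranked]
        return members[start:] if end == -1 else members[start:end + 1]


def save_video(r, video_id, title, rating, views):
    # The object itself lives in a hash...
    r.hset(f"video:{video_id}", mapping={"title": title, "rating": rating, "views": views})
    # ...and each list it appears in is a sorted set of ids, scored by the sort key.
    r.zadd("videos:top_rated", {video_id: rating})
    r.zadd("videos:most_viewed", {video_id: views})


def list_videos(r, list_key, count):
    # Read the ids for one page of the list, then fetch each object's hash.
    ids = r.zrevrange(list_key, 0, count - 1)
    return [r.hgetall(f"video:{vid}") for vid in ids]


r = FakeRedis()
save_video(r, "1", "first", 4.5, 900)
save_video(r, "2", "second", 3.9, 5000)
save_video(r, "3", "third", 4.8, 120)
print([v["title"] for v in list_videos(r, "videos:top_rated", 2)])
```

Against a real server, `redis.Redis(decode_responses=True)` could be passed in place of `FakeRedis()`. A production version would likely batch the per-video hash reads in a pipeline rather than issue them one at a time.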

Justin:   After deciding to use Redis, how long did it take to actually implement and have it working?

Eric:   Honestly, back at that point we were still ramping up the team. Like I said, it was a brand new project so it was mostly me and one other person when we did the initial staging.

I’d say within four weeks or so we had a good part of the site prototyped. We had the front page, all the main pages, and most of the video pages done. You could view comments – although at that point you couldn’t add comments – but a lot was done in just four weeks with just two people. This timeframe included learning a new framework (Symfony at the time), so we got up and running pretty quickly.

Justin:   How many instances are you using?

Eric:   I can’t get into specific numbers, but it’s fewer than 10.

Justin:   That’s really impressive. How did you guys manage to have so few?

Eric:   It’s grown over time as we have added functionality, but generally speaking we do a lot of caching with Redis. When we first launched the site, we did no caching. We just relied on Redis.

Over time we found the servers were running a little too hot for our tastes, so we started adding certain levels of caching. We’d have a second Redis node running on the website itself with very short cache times just to handle very popular page views.
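
The short-lived cache described here behaves like a read-through cache with a small time-to-live. The Python sketch below shows only that behavior and is not the team's implementation: the `ShortTTLCache` class, the five-second TTL, and the `load_from_primary` function are assumptions, and a plain dictionary stands in for the local Redis node (with redis-py, the equivalent write would be `set(key, value, ex=ttl_seconds)`).

```python
import time


class ShortTTLCache:
    """Read-through cache: serve a stored value until it expires,
    otherwise call the loader and remember the result for `ttl_seconds`."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so expiry can be tested without sleeping
        self._entries = {}      # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                      # still fresh: no backend call
        value = loader(key)                      # miss or expired: hit the primary store
        self._entries[key] = (now + self.ttl, value)
        return value


backend_calls = []


def load_from_primary(key):
    # Hypothetical stand-in for a query against the main Redis servers.
    backend_calls.append(key)
    return f"rendered:{key}"


fake_now = [0.0]
cache = ShortTTLCache(ttl_seconds=5, clock=lambda: fake_now[0])

cache.get_or_load("front_page", load_from_primary)   # miss, loads from primary
cache.get_or_load("front_page", load_from_primary)   # hit, served from cache
fake_now[0] = 6.0                                     # move past the 5 second TTL
cache.get_or_load("front_page", load_from_primary)   # expired, loads again
print(len(backend_calls))
```

Even a TTL of a few seconds caps the load a very popular page can put on the primary servers at roughly one query per TTL window per web server, at the cost of readers seeing data that is at most that many seconds old.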

You also have to understand that we use Varnish, which sits in front of the web servers, so the pages themselves are cached quite a bit and we’re not serving every page via Redis.

Justin:   When you went about making architecture decisions, can you talk about how you decided where to use Redis, and if you made any changes along the way?

Eric:   I’d say Redis was one of the first technologies that we knew we were going to use. That and Varnish, they were early decisions. Our tests on them were pretty good and, like I said, they had already been used by the company before so they weren’t unknown to us.

In terms of what we changed, the biggest change was adding a secondary Redis caching layer. It’s really lowered the queries per second on the servers and allowed us to have more of a safety net there.

Justin:   What would you say the biggest benefit has been after implementing this?

Eric:   For one, I would say the ability to rapidly create new features has been quite powerful with Redis. I mean it’s not just Redis, it’s the full software stack, but we’ve written a nice library that sits on top of the basic Redis libraries which allows us to quickly put together new features. That’s definitely been the biggest benefit we’ve seen.

Justin:   What were some roadblocks or difficulties in making this transition? Was there any custom stuff that you had to figure out and do on your own?

Eric:   Let me think here. Implementing the caching layer took some time. Like I said, the servers were running very hot and we didn’t really want to start throwing more and more servers at the problem, so building a solution took some time.

The other thing that took some time was figuring things out. These days, most websites built using Linux systems are using MySQL as the data store. MySQL does have a huge advantage in that there is lots of documentation. If you run into a problem, chances are somebody has already dealt with it before and you’ll find dozens of sites with information and advice. Redis just doesn’t have that type of community yet. If you want to read testimonials by other people that have set it up and what they’ve learned, what settings they’ve used, what their experiences are, there is a lot less information out there. There are far fewer tips and tricks, so there’s more of a learning curve.

There’s just not as much documentation out there compared to MySQL, so finding solutions to issues or simpler things, like setting up replications to disk, took a bit more time. However, as Redis is becoming more popular, the documentation and community are starting to form.

Justin:   Do you have any tips or tricks that you’d like to share with our audience?

Eric:   I’d say I just don’t know enough about most of the most valuable ones. I’m not a systems man and a lot of it was basically systems-type stuff. I’d say one trick that’s easy to miss is when you’re setting up replication to disk and you have a cluster of a master and slaves, make sure that there is enough time between each one so you don’t end up in a situation where they all decide to write to disk at the same time.

It’s very easy to overlook. Our initial servers were all fine, but later on, when we added more servers, we occasionally kept the default settings, which was something we had to fix. It’s one of those things that people could benefit a lot from. I’m a software developer, and I’d say most of the real lessons learned were more at the systems level. I don’t have enough information to really go into those.
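
One way to picture the staggering tip is to give each server its own offset within a shared snapshot interval. The Python sketch below is only an assumption about how that could be planned, not a description of Manwin's setup: the node names, the one-hour interval, and the `staggered_offsets` helper are hypothetical. It also assumes the automatic `save` rules are disabled on each node and that an external scheduler issues `BGSAVE` at the computed times.

```python
def staggered_offsets(nodes, interval_seconds):
    """Spread snapshot start times evenly across one interval so that
    no two nodes begin writing to disk at the same moment."""
    if not nodes:
        return {}
    gap = interval_seconds / len(nodes)
    return {node: round(i * gap) for i, node in enumerate(nodes)}


# Hypothetical cluster: one master and three replicas, snapshotting hourly.
nodes = ["master", "replica-1", "replica-2", "replica-3"]
offsets = staggered_offsets(nodes, interval_seconds=3600)

for node, offset in offsets.items():
    print(f"{node}: BGSAVE at +{offset // 60} min past the hour")
```

With four nodes and a one-hour interval, the snapshots start fifteen minutes apart. The gap only helps if each snapshot finishes well within it, so the interval should be chosen with the dataset size and disk speed in mind.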

Justin:   Great. Thanks for an awesome interview, and hope things continue to go well at Manwin!

Eric:   Thanks for having me.
