How YouPorn Uses Redis: SFW Edition

2023-10-30

I interviewed Eric Pickup, IT Lead at the Manwin group (the company behind sites like YouPorn and Pornhub), to tell us about their transition to using Redis, why they made the switch, and how well it’s worked. Check out Eric’s presentation on Building a Website to Scale, or get started with your own free Redis instance.

Justin: Can you talk about when and why you guys made the transition to Redis?

Eric: Basically, about two years ago we acquired the site [YouPorn]. It was written in Perl at the time, which was one of the reasons I was brought on board. Although I had a history of working with Perl, we quickly decided it was just not feasible to maintain. There just aren’t enough developers around, especially strong senior developers. So, if we were to keep it in Perl it was going to be a pretty stagnant site, which is something we obviously didn’t want to do.

Right away, the decision was made to rewrite it and we started looking at different technologies. Our first instinct was actually PHP, but we didn’t want to limit ourselves so we also looked at Java-based solutions. After a bunch of research and looking at what technologies we’d been experimenting with internally, we decided to stick with PHP.

Previously we’d been experimenting with Redis, Varnish and a few other technologies. Some sites internally had already started to use Redis, mostly as a caching solution, but we wanted to see if we could use it as a real data store.

We did some early tests and based our decision mainly on performance, since that was (and is) a huge issue for us. We were very, very impressed with Redis’ general performance, and after some discussion we decided we were going to use Redis as the primary database for the website.

Previously the site had been written in a traditional LAMP stack. It had Linux, Perl, MySQL and Memcached. There were obviously some concerns about the transition. One of the tradeoffs, which I’m actually really glad we did in hindsight, was we kept MySQL in the picture. We don’t read from MySQL on the website, but we are able to use it to do things like populate new lists or hashes, as well as things we couldn’t anticipate ahead of time. We have MySQL more for ad hoc queries, and use Redis for the website.

Soon after we started developing with it, we felt we’d made the right decision. For the first month or so we were prepared to reexamine our decision, but we became comfortable pretty quickly. It was really a good fit for our use case.

Justin: Why is that, and what were you looking at in terms of evaluating whether or not it was a good decision?

Eric: Obviously ease of development is a huge one, especially when you are rewriting an entire project like this. Luckily, Redis’ data structures mapped well to what we were doing.

YouPorn at the end of the day is mostly about lists of videos and lists of objects, whether it be comments, favorites, the top rated videos, or the most viewed videos. It’s all lists and then objects, which obviously map well to hashes. We do use some of the other data types but I’d have to say that about 90% of our usage falls into the case of either sorted sets or hashes.
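As a rough illustration of that mapping (not YouPorn's actual schema), a ranked list can be a sorted set of video IDs scored by view count, with each video's fields stored in a hash. The sketch below assumes a redis-py client created with something like `redis.Redis(decode_responses=True)` and passed in as `r`; the key names `video:<id>` and `videos:most_viewed` and the helper functions are hypothetical.

```python
def save_video(r, video_id, title, views):
    """Store one video as a hash and index it in a sorted set by view count."""
    # Hash: one field per attribute of the object.
    r.hset(f"video:{video_id}", mapping={"title": title, "views": views})
    # Sorted set: the member is the video ID, the score is the ranking value.
    r.zadd("videos:most_viewed", {str(video_id): views})


def record_view(r, video_id):
    """Bump the view counter in both the hash and the ranking."""
    r.hincrby(f"video:{video_id}", "views", 1)
    r.zincrby("videos:most_viewed", 1, str(video_id))


def most_viewed(r, page=0, per_page=20):
    """Return one page of video hashes, highest score first."""
    start = page * per_page
    ids = r.zrevrange("videos:most_viewed", start, start + per_page - 1)
    return [r.hgetall(f"video:{vid}") for vid in ids]
```

Other rankings (top rated, newest) would follow the same pattern with a different sorted set and score, which is one plausible reading of why these two data types cover most of the site.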

Justin: After deciding to use Redis, how long did it take to actually implement and have it working?

Eric: Honestly, back at that point we were still ramping up the team. Like I said, it was a brand new project so it was mostly me and one other person when we did the initial staging.

I’d say within four weeks or so we had a good part of the site prototyped. We had the front page, all the main pages, and most of the video pages done. You could view comments – although at that point you couldn’t add comments – but a lot was done in just four weeks with just two people. This timeframe included learning a new framework (Symfony at the time) so we got up and running pretty quickly.

Justin: How many instances are you using?

Eric: I can’t get into specific numbers, but it’s fewer than 10.

Justin: That’s really impressive. How did you guys manage to have so few?

Eric: It’s grown over time as we have added functionality, but generally speaking we do a lot of caching with Redis. When we first launched the site, we did no caching. We just relied on Redis.

Over time we found the servers were running a little too hot for our tastes, so we started adding certain levels of caching. We’d have a second Redis node running on the web server itself with very short cache times just to handle very popular page views.

You also have to understand that we use Varnish, which sits in front of the web servers, so the pages themselves are cached quite a bit and we’re not serving every page via Redis.
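The second Redis node with very short cache times could be sketched roughly as follows. This is only an illustration under assumptions: `local` and `primary` are two redis-py clients created with `decode_responses=True`, and the function name, the `cache:` key prefix, and the 5-second TTL are made up rather than taken from the interview.

```python
import json


def cached_hash(local, primary, key, ttl_seconds=5):
    """Read a hash through a short-lived cache on the web server's own Redis."""
    cache_key = f"cache:{key}"
    hit = local.get(cache_key)
    if hit is not None:
        # Served locally; the primary data store is not touched.
        return json.loads(hit)
    # Miss: fall through to the primary data store.
    value = primary.hgetall(key)
    # SET with an expiry lets the entry disappear on its own,
    # so no explicit invalidation is needed for hot pages.
    local.set(cache_key, json.dumps(value), ex=ttl_seconds)
    return value
```

With a TTL of a few seconds, a very popular page costs the primary server at most one read per web server per TTL window, at the price of data that can be slightly stale.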

Justin: When you went about making architecture decisions, can you talk about how you decided where to use Redis, and if you made any changes along the way?

Eric: I’d say Redis was one of the first technologies that we knew we were going to use. That and Varnish, they were early decisions. Our tests on them were pretty good and, like I said, they had already been used by the company before so they weren’t unknown to us.

In terms of what we changed, the biggest change was adding a secondary Redis caching layer. It’s really lowered the queries per second on the servers and allowed us to have more of a safety net there.

Justin: What would you say the biggest benefit has been after implementing this?

Eric: For one, I would say the ability to rapidly create new features has been quite powerful with Redis. I mean it’s not just Redis, it’s the full software stack, but we’ve written a nice library that sits on top of the basic Redis libraries which allows us to quickly put together new features. That’s definitely been the biggest benefit we’ve seen.

Justin: What were some roadblocks or difficulties in making this transition? Was there any custom stuff that you had to figure out and do on your own?

Eric: Let me think here. Implementing the caching layer took some time. Like I said, the servers were running very hot and we didn’t really want to start throwing more and more servers at the problem, so building a solution took some time.

The other thing that took some time was figuring things out. These days, most websites built using Linux systems are using MySQL as the data store. MySQL does have a huge advantage in that there is lots of documentation. If you run into a problem, chances are somebody has already dealt with it before and you’ll find dozens of sites with information and advice. Redis just doesn’t have that type of community yet. If you want to read testimonials by other people that have set it up and what they’ve learned, what settings they’ve used, what their experiences are, there is a lot less information out there. There are a lot fewer tips and tricks so there’s more of a learning curve.

There’s just not as much documentation out there compared to MySQL, so finding solutions to issues or simpler things, like setting up replication to disk, took a bit more time. However, as Redis is becoming more popular, the documentation and community are starting to form.

Justin: Do you have any tips or tricks that you’d like to share with our audience?

Eric: I’d say most of the most valuable ones, I just don’t know enough about. I’m not a systems man and a lot of it was basically system type stuff. I’d say one trick that’s easy to miss is when you’re setting up replication to disk and you have a cluster of master and slaves, make sure that there is enough time between each one so you don’t end up in a situation where they all decide to write to disk at the same time.

It’s very easy to overlook. Our initial servers were all good, but later on, when we added more servers, we occasionally kept the default settings, which was something we had to fix. It’s one of those things that people could benefit a lot from. I’m a software developer, and I’d say most of the real lessons learned were more at the systems level. I don’t have enough information to really go into those.
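One way to picture the staggering, as a sketch only: give each node its own offset within the snapshot interval and trigger the snapshot from a scheduler, instead of leaving every node on identical default `save` rules. The node names and the 15-minute interval below are invented for illustration; with redis-py, the scheduled job on each node could call `bgsave()` at its offset.

```python
def staggered_offsets(nodes, interval_seconds):
    """Spread snapshot start times evenly across one interval.

    Returns {node: seconds after the start of each interval}.
    """
    if not nodes:
        return {}
    step = interval_seconds // len(nodes)
    return {node: i * step for i, node in enumerate(nodes)}


# Hypothetical node names; 900 s means one snapshot per node every 15 minutes.
schedule = staggered_offsets(["master", "replica-1", "replica-2"], 900)
for node, offset in schedule.items():
    print(f"{node}: snapshot at +{offset}s")
```

The point of the even spacing is simply that no two nodes start writing to disk in the same window, which is the situation Eric describes wanting to avoid.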

Justin: Great. Thanks for an awesome interview, and hope things continue to go well at Manwin!

Eric: Thanks for having me.
