Neo4j

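Neo4j is a graph database designed for modeling connected data. This article starts from the basic concepts and covers common Neo4j usage: how to model nodes, relationships, and properties, how to write basic Cypher queries, and how to create, query, and update data and analyze relationship paths. The goal is to help you quickly understand where graph databases fit and what the development workflow looks like.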

Movie graph

1. Querying

1.1 Filtering by property

MATCH (p:Person {name: 'Tom Hanks'})
RETURN p

1.2 Returning property values

MATCH (p:Person {name: 'Tom Hanks'})
RETURN  p.born

1.3 Filtering with WHERE

MATCH (p:Person)
WHERE p.name = 'Tom Hanks' OR p.name = 'Rita Wilson'
RETURN p.name, p.born

Not-equal comparison:

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name <> 'Tom Hanks' // people whose name is not Tom Hanks
AND m.title = 'Captain Phillips'
RETURN p.name

1.4 Matching relationships

MATCH (p:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie)
WHERE m.title='The Matrix'
RETURN m.title

Note: if the Movie label is omitted, m matches nodes with any label.

An alternative form:

MATCH (p)-[:ACTED_IN]->(m)
WHERE p:Person AND m:Movie AND m.title='The Matrix'
RETURN p.name

1.4.1 Range queries

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE 2000 <= m.released <= 2003
RETURN p.name, m.title, m.released

Individual comparison operators:

MATCH (m:Movie) WHERE m.title = 'Toy Story'
RETURN
    m.year < 1995 AS lessThan, //  Less than (false)
    m.year <= 1995 AS lessThanOrEqual, // Less than or equal(true)
    m.year > 1995 AS moreThan, // More than (false)
    m.year >= 1995 AS moreThanOrEqual // More than or equal (true)

1.4.2 Testing whether a property exists

IS NOT NULL

MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name='Jack Nicholson' AND m.tagline IS NOT NULL
RETURN m.title, m.tagline

1.4.3 Substring filtering

STARTS WITH

ENDS WITH

CONTAINS

MATCH (p:Person)-[:ACTED_IN]->()
WHERE p.name STARTS WITH 'Michael'
RETURN p.name

Case-insensitive matching:

toLower()

toUpper()

MATCH (p:Person)-[:ACTED_IN]->()
WHERE toLower(p.name) STARTS WITH 'michael'
RETURN p.name

Note: wrapping the property in toLower() or toUpper() may prevent the query from using an index on that property.

1.4.4 Filtering on relationships

NOT exists(xxx)

MATCH (p:Person)-[:WROTE]->(m:Movie)
WHERE NOT exists( (p)-[:DIRECTED]->(m) )
RETURN p.name, m.title

Note: where a plain MATCH pattern can express the query, prefer it over exists(), because the pattern form generally performs better.
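
As a hedged aside, the same filter can also be written as an existential subquery, EXISTS { ... }. The sketch below assumes the same movie graph and a Neo4j version that supports this syntax (4.x or later); it makes no performance claim, so compare the variants with PROFILE before choosing one.

```cypher
// Writers who did not direct the movie they wrote,
// expressed with an existential subquery instead of the exists() function.
MATCH (p:Person)-[:WROTE]->(m:Movie)
WHERE NOT EXISTS { (p)-[:DIRECTED]->(m) }
RETURN p.name, m.title
```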

1.4.5 Filtering with lists

IN

Against a literal list:

MATCH (p:Person)
WHERE p.born IN [1965, 1970, 1975]
RETURN p.name, p.born

When the property itself is a list:

MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
WHERE  'Neo' IN r.roles AND m.title='The Matrix'
RETURN p.name, r.roles

1.4.6 OPTIONAL MATCH

MATCH (m:Movie) WHERE m.title = "Kiss Me Deadly"
MATCH (m)-[:IN_GENRE]->(g:Genre)<-[:IN_GENRE]-(rec:Movie)
OPTIONAL MATCH (m)<-[:ACTED_IN]-(a:Actor)-[:ACTED_IN]->(rec)
RETURN rec.title, a.name

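When OPTIONAL MATCH finds nothing, the row is still returned; the unmatched a.name is simply null. In other words, every movie in the same genre is kept, whether or not it shares an actor with the original movie.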
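
If null values are awkward for the client, coalesce() can substitute a placeholder. The following is a minimal sketch on the same query; the alias sharedActor and the placeholder string are illustrative choices, not part of the original example.

```cypher
// Same recommendation query, but rows without a shared actor
// show a placeholder string instead of null.
MATCH (m:Movie) WHERE m.title = "Kiss Me Deadly"
MATCH (m)-[:IN_GENRE]->(g:Genre)<-[:IN_GENRE]-(rec:Movie)
OPTIONAL MATCH (m)<-[:ACTED_IN]-(a:Actor)-[:ACTED_IN]->(rec)
RETURN rec.title, coalesce(a.name, '(no shared actor)') AS sharedActor
```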

1.5 Inspecting node and graph properties

List the property keys of each node:

MATCH (p:Person)
RETURN p.name, keys(p)

List every property key in the graph:

CALL db.propertyKeys()

1.6 Aliasing return values with AS

MATCH (p:Person)
RETURN p.name AS Name

1.7 Profiling query performance with PROFILE

PROFILE MATCH (p:Person)
RETURN p.name AS Name

1.8 Inspecting the data model

This shows which relationships exist between nodes with different labels:

CALL db.schema.visualization()

List the property types of all nodes:

CALL db.schema.nodeTypeProperties()

List the property types of all relationships:

CALL db.schema.relTypeProperties()

List all constraints:

SHOW CONSTRAINTS

1.9 Sorting query results

ORDER BY

MATCH (u:User)-[r:RATED]->(m:Movie)
WHERE u.name = 'Sandy Jones'
RETURN m.title AS movie, r.rating AS rating ORDER BY r.rating DESC

ORDER BY accepts multiple sort expressions, separated by commas.
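
A minimal sketch of a two-key sort, assuming the same User, RATED, and Movie model as the query above: ratings are sorted from high to low, and movies with the same rating are then sorted alphabetically by title.

```cypher
MATCH (u:User)-[r:RATED]->(m:Movie)
WHERE u.name = 'Sandy Jones'
RETURN m.title AS movie, r.rating AS rating
// First key: rating, descending. Second key: title, ascending (tiebreaker).
ORDER BY r.rating DESC, m.title ASC
```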

1.10 Limiting the number of results

LIMIT

MATCH (m:Movie)
WHERE m.released IS NOT NULL
RETURN m.title AS title,
m.released AS releaseDate
ORDER BY m.released DESC LIMIT 100

1.11 Pagination

Combine SKIP and LIMIT to page through results:

MATCH (p:Person)
WHERE p.born.year = 1980
RETURN  p.name as name,
p.born AS birthDate
ORDER BY p.born SKIP 40 LIMIT 10
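
In application code the offsets usually come from parameters rather than literals. The sketch below assumes two hypothetical integer parameters, $skip and $limit (in Neo4j Browser they could be set with `:param skip => 40` and `:param limit => 10` so that they stay integers). It also adds p.name as a tiebreaker, since pages are only stable when the sort order is fully determined.

```cypher
// $skip and $limit are hypothetical parameter names; both must be integers.
MATCH (p:Person)
WHERE p.born.year = 1980
RETURN p.name AS name,
       p.born AS birthDate
ORDER BY p.born, p.name
SKIP $skip LIMIT $limit
```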

1.12 Removing duplicate rows

MATCH (p:Person)-[:DIRECTED | ACTED_IN]->(m:Movie)
WHERE p.name = 'Tom Hanks'
RETURN DISTINCT m.title, m.released
ORDER BY m.title

1.13 Map projection (turning a node into a JSON-like object)

Project all of a node's properties into a person object:

MATCH (p:Person)
WHERE p.name CONTAINS "Thomas"
RETURN p { .* } AS person
ORDER BY p.name ASC

Project only selected properties:

MATCH (p:Person)
WHERE p.name CONTAINS "Thomas"
RETURN p { .name, .born } AS person
ORDER BY p.name

1.14 Transforming returned values

Compute an age:

MATCH (m:Movie)<-[:ACTED_IN]-(p:Person)
WHERE m.title CONTAINS 'Toy Story' AND
p.died IS NULL
RETURN m.title AS movie,
p.name AS actor,
p.born AS dob,
date().year - p.born.year AS ageThisYear

Return a value conditionally with CASE:

MATCH (m:Movie)<-[:ACTED_IN]-(p:Person)
WHERE p.name = 'Henry Fonda'
RETURN m.title AS movie,
CASE
WHEN m.year < 1940 THEN 'oldies'
WHEN 1940 <= m.year < 1950 THEN 'forties'
WHEN 1950 <= m.year < 1960 THEN 'fifties'
WHEN 1960 <= m.year < 1970 THEN 'sixties'
WHEN 1970 <= m.year < 1980 THEN 'seventies'
WHEN 1980 <= m.year < 1990 THEN 'eighties'
WHEN 1990 <= m.year < 2000 THEN 'nineties'
ELSE  'two-thousands'
END
AS timeFrame

1.15 Aggregation

Whenever RETURN contains an aggregating function, rows are grouped implicitly by the non-aggregated expressions.
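
The implicit grouping rule can be tried without any dataset. The sketch below uses made-up literal rows (actors 'A' and 'B', movies 'X' and 'Y' are placeholders) and returns two rows: ('A', 2) and ('B', 1).

```cypher
UNWIND [
  {actor: 'A', movie: 'X'},
  {actor: 'A', movie: 'Y'},
  {actor: 'B', movie: 'X'}
] AS row
// row.actor is not aggregated, so it becomes the grouping key;
// count(*) is computed once per distinct actor.
RETURN row.actor AS actor, count(*) AS numMovies
ORDER BY actor
```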

1.15.1 Grouped aggregation

Group rows that share the same a.name and d.name, count each group, and sort by the count:

MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:DIRECTED]-(d:Person)
RETURN a.name, d.name,
count(*) AS numMovies
ORDER BY numMovies DESC

1.15.2 Aggregating into lists

For each actor, return the number of movies they acted in and a list of those movies:

MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
RETURN a.name AS actor,
count(*) AS total,
collect(m.title) AS movies
ORDER BY total DESC LIMIT 10

Remove duplicates from a list:

MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
WHERE m.year = 1920
RETURN  collect( DISTINCT m.title) AS movies,
collect( a.name) AS actors

Return the first cast member:

MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
RETURN m.title AS movie,
collect(a.name)[0] AS castMember,
size(collect(a.name)) as castSize

Return every cast member from the second onward (list indexes start at 0, so the slice is [1..]):

MATCH (a:Person)-[:ACTED_IN]->(m:Movie)
RETURN m.title AS movie,
collect(a.name)[1..] AS castMember,
size(collect(a.name)) as castSize
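
List indexing and slicing can be checked on a literal list. In this sketch the names are placeholder values; indexes start at 0 and the end of a range is exclusive.

```cypher
WITH ['Keanu', 'Carrie-Anne', 'Laurence', 'Hugo'] AS cast
RETURN cast[0]    AS first,       // 'Keanu'
       cast[1..]  AS fromSecond,  // ['Carrie-Anne', 'Laurence', 'Hugo']
       cast[2..]  AS fromThird,   // ['Laurence', 'Hugo']
       cast[0..2] AS firstTwo,    // ['Keanu', 'Carrie-Anne']
       size(cast) AS castSize     // 4
```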

1.16 Working with dates and times

date: "2026-02-22"

datetime: "2026-02-22T12:29:41.254000000Z"

time: "12:29:41.254000000Z"

MERGE (x:Test {id: 1})
SET x.date = date(),
    x.datetime = datetime(),
    x.time = time()
RETURN x

Access individual components of a date or datetime:

MATCH (x:Test {id: 1})
RETURN x.date.day, x.date.year,
x.datetime.year, x.datetime.hour,
x.datetime.minute

Set values from strings:

MATCH (x:Test {id: 1})
SET x.date1 = date('2022-01-01'),
    x.date2 = date('2022-01-15')
RETURN x

MATCH (x:Test {id: 1})
SET x.datetime1 = datetime('2022-01-04T10:05:20'),
    x.datetime2 = datetime('2022-04-09T18:33:05')
RETURN x

Return the duration between two temporal values:

MATCH (x:Test {id: 1})
RETURN duration.between(x.date1,x.date2)

Return the number of days between them:

MATCH (x:Test {id: 1})
RETURN duration.inDays(x.datetime1,x.datetime2).days
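
The two functions differ in how they split the result into components, which matters when reading .days. A small sketch with literal dates chosen only for illustration:

```cypher
WITH date('2022-01-01') AS d1, date('2022-03-15') AS d2
RETURN duration.between(d1, d2)      AS between,      // P2M14D
       duration.between(d1, d2).days AS betweenDays,  // 14 (months are a separate component)
       duration.inDays(d1, d2)       AS inDays,       // P73D
       duration.inDays(d1, d2).days  AS totalDays     // 73
```

For a total day count, duration.inDays is therefore the safer choice.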

Return the date six months later:

MATCH (x:Test {id: 1})
RETURN x.date1 + duration({months: 6})

Format a time with APOC:

MATCH (x:Test {id: 1})
RETURN x.datetime as Datetime,
apoc.temporal.format( x.datetime, 'HH:mm:ss.SSSS')
AS formattedDateTime

Format as standard ISO 8601:

MATCH (x:Test {id: 1})
RETURN apoc.date.toISO8601(x.datetime.epochMillis, "ms")
AS iso8601

1.17 Graph traversal

1.17.1 Shortest path

shortestPath

Find the shortest path over relationships of any type:

MATCH p = shortestPath((p1:Person)-[*]-(p2:Person))
WHERE p1.name = "Eminem"
AND p2.name = "Charlton Heston"
RETURN  p

Find the shortest path that uses only ACTED_IN relationships (in either direction):

MATCH p = shortestPath((p1:Person)-[:ACTED_IN*]-(p2:Person))
WHERE p1.name = "Eminem"
AND p2.name = "Charlton Heston"
RETURN  p
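
Returning the whole path is convenient in a graph view; in tabular output it often helps to summarize it. A minimal sketch on the same query, where the aliases hops and stops are illustrative and coalesce() picks name for Person nodes and title for Movie nodes:

```cypher
MATCH p = shortestPath((p1:Person)-[:ACTED_IN*]-(p2:Person))
WHERE p1.name = "Eminem"
AND p2.name = "Charlton Heston"
// length(p) counts relationships; nodes(p) lists the nodes in path order.
RETURN length(p) AS hops,
       [n IN nodes(p) | coalesce(n.name, n.title)] AS stops
```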

Find nodes exactly two hops away:

MATCH (p:Person {name: 'Eminem'})-[:ACTED_IN*2]-(others:Person)
RETURN  others.name

Find nodes up to four hops away:

MATCH (p:Person {name: 'Eminem'})-[:ACTED_IN*1..4]-(others:Person)
RETURN  others.name

1.18 Pipelining queries

1.18.1 Variable scope

WITH  'toy story' AS mt, 'Tom Hanks' AS actorName
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WITH m, toLower(m.title) AS movieTitle
WHERE p.name = actorName
AND movieTitle CONTAINS mt
RETURN m.title AS movies, movieTitle

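Note: the WHERE attached to a WITH is evaluated as part of that WITH, so it can still see p, mt, and actorName. The RETURN that follows only sees what the WITH passed on (m and movieTitle), so it cannot access p, mt, or actorName.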
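
If a variable such as p is needed after the WITH, it has to be listed there explicitly. A minimal sketch on the same movie graph; the aliases actor and movie are illustrative names:

```cypher
WITH 'toy story' AS mt, 'Tom Hanks' AS actorName
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
// p is carried through the WITH, so RETURN can use it.
WITH p, m, toLower(m.title) AS movieTitle
WHERE p.name = actorName
AND movieTitle CONTAINS mt
RETURN p.name AS actor, m.title AS movie, movieTitle
```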

Use WITH to limit the number of rows:

WITH  'Tom Hanks' AS theActor
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name = theActor
WITH m  LIMIT 2
// possibly do more with the two m nodes
RETURN m.title AS movies

Use WITH to sort:

WITH  'Tom Hanks' AS theActor
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name = theActor
WITH m ORDER BY m.year LIMIT 5
// possibly do more with the five m nodes in a particular order
RETURN m.title AS movies, m.year AS yearReleased

Use a map projection in WITH to choose which fields to return:

MATCH (n:Movie)
WHERE n.imdbRating IS NOT NULL
AND n.poster IS NOT NULL
WITH n {
  .title,
  .year,
  .languages,
  .plot,
  .poster,
  .imdbRating,
  directors: [ (n)<-[:DIRECTED]-(d) | d { tmdbId:d.imdbId, .name } ]
}
ORDER BY n.imdbRating DESC LIMIT 4
RETURN collect(n)

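Note: collect() breaks streaming; every n must be computed before anything is returned to the client.

1.18.2 Pattern comprehension

[ pattern | expression ]: the left side is a pattern to match, and the right side is the value produced for each match.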

MATCH (n:Movie)
WHERE n.imdbRating IS NOT NULL
WITH n {
  .title,
  .imdbRating,
  actors: [ (n)<-[:ACTED_IN]-(p) | p { .imdbId, .name } ],
  genres: [ (n)-[:IN_GENRE]->(g) | g {.name} ]
}
ORDER BY n.imdbRating DESC
LIMIT 10
RETURN collect(n)
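
The [ pattern | expression ] form used above is a pattern comprehension. For contrast, a list comprehension iterates over an existing list instead of a graph pattern and can filter with WHERE. A self-contained sketch that needs no dataset:

```cypher
// Keep the odd numbers from 1 to 5 and square each one.
RETURN [x IN range(1, 5) WHERE x % 2 = 1 | x * x] AS oddSquares
// oddSquares is [1, 9, 25]
```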

1.18.3 Pipelining

Feed the five m nodes into the next MATCH:

WITH  'Tom Hanks' AS theActor
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name = theActor
WITH m  LIMIT 5
MATCH (d:Person)-[:DIRECTED]->(m)
RETURN d.name AS director,
m.title AS movies

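1.18.4 Unwinding lists in a pipeline

UNWIND

Each returned row contains the movie title, the languages property (repeated on every row), and one language value: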

MATCH (m:Movie)-[:ACTED_IN]-(a:Actor)
WHERE a.name = 'Tom Hanks'
UNWIND m.languages AS lang
RETURN m.title AS movie,
m.languages AS languages,
lang AS language

Return each language name together with a list of up to 10 movie titles in that language:

MATCH (m:Movie)
UNWIND m.languages AS lang
WITH m, trim(lang) AS language
// language is distinct automatically, because it is a grouping key
WITH language, collect(m.title) AS movies
RETURN language, movies[0..10]
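
The same UNWIND, trim, and collect pipeline can be tried on literal data. In this sketch the two maps stand in for Movie nodes (titles 'A' and 'B' are placeholders), and the stray space in ' French' shows why trim() is applied. It returns one row per distinct language.

```cypher
UNWIND [
  {title: 'A', languages: ['English', ' French']},
  {title: 'B', languages: ['English']}
] AS m
// One row per (movie, language) pair.
UNWIND m.languages AS lang
// language is the grouping key; titles are collected per language.
WITH trim(lang) AS language, collect(m.title) AS movies
RETURN language, movies
ORDER BY language
```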

1.19 Reducing memory use

1.19.1 Subqueries

MATCH (m:Movie)
CALL {
    WITH m
    MATCH (m)<-[r:RATED]-(u:User)
    WHERE r.rating = 5
    RETURN count(u) AS numReviews
}
RETURN m.title, numReviews
ORDER BY numReviews DESC

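The query above is equivalent to:

for each m:
    enter the subquery
    process only this one movie
    compute the count immediately
    return a single number

This avoids the following problem with a direct query:

first expand every movie × rating row
then aggregate over the whole set

Note: in this CALL { ... } form, a subquery must import outer variables explicitly with WITH.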
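
For comparison, the direct form that the subquery avoids might look like the sketch below, assuming the same User, RATED, and Movie model. It is not an exact equivalent: movies with no 5-star rating drop out of the result entirely, and rows are grouped by m.title rather than per movie node.

```cypher
// Direct aggregation: all matching (movie, rating) rows are produced first,
// then grouped by title and counted.
MATCH (m:Movie)<-[r:RATED]-(u:User)
WHERE r.rating = 5
RETURN m.title, count(u) AS numReviews
ORDER BY numReviews DESC
```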

1.19.2 Combining query results

MATCH (m:Movie) WHERE m.year = 2000
RETURN {type:"movies", theMovies: collect(m.title)} AS data
UNION ALL
MATCH (a:Actor) WHERE a.born.year > 2000
RETURN { type:"actors", theActors: collect(DISTINCT a.name)} AS data

The result has this shape:

return [
  {
    data:{
      type:"movies",
      theMovies:[...]
    }
  },
  {
    data:{
      type:"actors",
      theActors:[...]
    }
  }
]

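Note: UNION ALL does not remove duplicates. Plain UNION does, and that deduplication costs time and memory, so prefer UNION ALL unless duplicates really must be removed.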
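
The difference is easy to see on literal values. The sketch below contains two separate queries that need no dataset; run each one on its own.

```cypher
// Query 1: UNION ALL keeps duplicates, so this returns two rows, both with n = 1.
RETURN 1 AS n
UNION ALL
RETURN 1 AS n;

// Query 2: UNION removes duplicates, so this returns a single row with n = 1.
RETURN 1 AS n
UNION
RETURN 1 AS n;
```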

Used together with a subquery:

MATCH (p:Person)
WITH p LIMIT 100
CALL {
  WITH p
  OPTIONAL MATCH (p)-[:ACTED_IN]->(m:Movie)
  RETURN m.title + ": " + "Actor" AS work
UNION
  WITH p
  OPTIONAL MATCH (p)-[:DIRECTED]->(m:Movie)
  RETURN m.title+ ": " +  "Director" AS work
}
RETURN p.name, collect(work)

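UNION lets the subquery emit both kinds of work value in one result column.

1.20 Parameters

1.20.1 Setting a single parameter

:param number: 10 // this form is converted to a floating-point number
:param number=> 10  // this form keeps the value as an integer

1.20.2 Setting multiple parameters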

:params {actorName: 'Tom Cruise', movieName: 'Top Gun'}
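
Once set, parameters are referenced in a query with a $ prefix. A minimal sketch, assuming the actorName and movieName parameters above are set and the movie graph is loaded; the aliases actor and movie are illustrative:

```cypher
// $actorName and $movieName are filled in from the current parameter set.
MATCH (p:Person {name: $actorName})-[:ACTED_IN]->(m:Movie {title: $movieName})
RETURN p.name AS actor, m.title AS movie
```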

Add one more parameter to the existing set:

:param number=> 10

1.20.3 Viewing and clearing parameters

View the current parameters:

:params

Clear all parameters:

:params {}

2. Writing

2.1 Creating nodes

Create a single node:

MERGE (p:Person {name: 'Michael Caine'})

CREATE (p:Person {name: 'Michael Caine'})

Create several nodes:

MERGE (p:Person {name: 'Katie Holmes'})
MERGE (m:Movie {title: 'The Dark Knight'})
RETURN p, m


CREATE (p:Person {name: 'Katie Holmes'})
CREATE (m:Movie {title: 'The Dark Knight'})
RETURN p, m

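Note: CREATE always creates a new node. MERGE first looks for an existing node that matches the whole pattern (the label and every property listed) and reuses it; if any listed property differs or is missing, MERGE creates a new node.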
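
One practical consequence, sketched below under the assumption that name acts as the identifying property of a Person: putting extra properties inside the MERGE pattern can create an unwanted second node, so it is usually safer to MERGE on the key alone and SET the rest. The two statements are alternatives and should be run separately.

```cypher
// Pitfall: if a 'Michael Caine' node exists without born = 1933,
// this pattern does not match it and a second node is created.
MERGE (p:Person {name: 'Michael Caine', born: 1933});

// Safer: match or create on the key only, then set the other properties.
MERGE (p:Person {name: 'Michael Caine'})
SET p.born = 1933
RETURN p;
```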

2.2 Creating relationships

Find the nodes first, then create the relationship:

MATCH (p:Person {name: 'Michael Caine'})
MATCH (m:Movie {title: 'The Dark Knight'})
MERGE (p)-[:ACTED_IN]->(m)

Create the nodes and the relationship together:

MERGE (p:Person {name: 'Chadwick Boseman'})
MERGE (m:Movie {title: 'Black Panther'})
MERGE (p)-[:ACTED_IN]-(m)

Or as a single MERGE pattern, run as a separate query:

MERGE (p:Person {name: 'Emily Blunt'})-[:ACTED_IN]->(m:Movie {title: 'A Quiet Place'})
RETURN p, m

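Note: if the relationship direction is omitted in MERGE, the relationship is created from left to right by default.

2.3 Setting relationship properties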

MATCH (p:Person {name: 'Michael Caine'})
MERGE (m:Movie {title: 'Batman Begins'})
MERGE (p)-[:ACTED_IN {roles: ['Alfred Penny']}]->(m)
RETURN p,m

Create relationships with an APOC procedure:

MATCH (n:Actor)-[:ACTED_IN]->(m:Movie)
CALL apoc.merge.relationship(n,
  'ACTED_IN_' + left(m.released,4),
  {},
  {},
  m ,
  {}
) YIELD rel
RETURN count(*) AS `Number of relationships merged`;

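This creates year-specific relationship types named ACTED_IN_xxx, which record the year in which an actor appeared in a movie, so that a query for one year does not have to traverse every movie.

2.4 Supported property types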

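| Category | Type | Supported | Example | Notes |
| --- | --- | --- | --- | --- |
| Numeric | Integer | Yes | {age: 30} | Whole number |
| Numeric | Float | Yes | {price: 19.9} | Floating-point number |
| String | String | Yes | {name: 'Alice'} | Most commonly used |
| Boolean | Boolean | Yes | {active: true} | true/false |
| Temporal | Date | Yes | {d: date('2025-01-01')} | Calendar date |
| Temporal | Datetime | Yes | {t: datetime()} | Includes time zone |
| Temporal | LocalDateTime | Yes | {t: localdatetime()} | No time zone |
| Temporal | Time | Yes | {t: time()} | Time with time zone |
| Temporal | LocalTime | Yes | {t: localtime()} | Local time |
| Temporal | Duration | Yes | {dur: duration('P1D')} | Length of time |
| Spatial | Point | Yes | {loc: point({latitude:40.7, longitude:-74})} | Geographic coordinate |
| Array | Homogeneous array | Yes | {tags:['a','b']} | Elements must share one type |
| Array | Numeric array | Yes | {nums:[1,2,3]} | Supported |
| Array | Boolean array | Yes | {flags:[true,false]} | Supported |
| Array | Nested array | No | {bad:[[1],[2]]} | Not allowed |
| Array | Mixed-type array | No | {bad:[1,'a']} | Not allowed |
| Complex | Map/Object | No | {info:{age:18}} | Not allowed |
| Complex | JSON | No | {json:{...}} | Not allowed |
| Complex | Node | No | {u:node} | Not allowed |
| Complex | Relationship | No | {r:rel} | Not allowed |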

2.5 Setting properties

Several properties can be set in one SET clause:

MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
WHERE p.name = 'Michael Caine' AND m.title = 'The Dark Knight'
SET r.roles = ['Alfred Penny'], m.released = 2008
RETURN p, r, m

Run SET only when MERGE creates a node or only when it matches one:

// Find or create a person with this name
MERGE (p:Person {name: 'McKenna Grace'})

// Only set the `createdAt` property if the node is created during this query
ON CREATE SET p.createdAt = datetime()

// Only set the `updatedAt` property if the node was created previously
ON MATCH SET p.updatedAt = datetime()

// Set the `born` property regardless
SET p.born = 2006

RETURN p

2.6 Removing properties

MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
WHERE p.name = 'Michael Caine' AND m.title = 'The Dark Knight'
REMOVE r.roles
RETURN p, r, m

Remove a property by setting it to null:

MATCH (p:Person)
WHERE p.name = 'Gene Hackman'
SET p.born = null
RETURN p

Note: never remove a property that serves as a node's primary key.

2.7 Deleting nodes and relationships

Delete a node:

MATCH (p:Person {name: 'Jane Doe'})
DELETE p

Delete a relationship while keeping the nodes at both ends:

MATCH (p:Person {name: 'Jane Doe'})-[r:ACTED_IN]->(m:Movie {title: 'The Matrix'})
DELETE r
RETURN p, m

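Note: relationships must be deleted before the node they are attached to; deleting a node that still has relationships raises an error.

Delete a node together with all of its relationships: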

MATCH (p:Person {name: 'Jane Doe'})
DETACH DELETE p

2.8 Deleting every node and relationship in the database

Use with caution!

MATCH (n)
DETACH DELETE n
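
On a large graph, a single MATCH (n) DETACH DELETE n may exhaust memory because the whole deletion runs in one transaction. A hedged sketch of a batched variant follows. It assumes a Neo4j version that supports CALL { ... } IN TRANSACTIONS (4.4 or later), and the batch size of 10000 is an arbitrary choice, not a recommendation from the original text.

```cypher
// Delete in batches so that no single transaction has to hold the whole graph.
// In Neo4j Browser, prefix the query with :auto, because
// CALL { ... } IN TRANSACTIONS only runs in an implicit transaction.
MATCH (n)
CALL {
  WITH n
  DETACH DELETE n
} IN TRANSACTIONS OF 10000 ROWS
```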

2.9 Adding and removing labels

Add a label:

MATCH (p:Person {name: 'Jane Doe'})
SET p:Developer
RETURN p

Remove a label:

MATCH (p:Person {name: 'Jane Doe'})
REMOVE p:Developer
RETURN p

2.10 Listing all labels

CALL db.labels()

2.11 Adding labels in bulk

Add an extra label to every node that has a given relationship:

MATCH (p:Person)
WHERE exists ((p)-[:ACTED_IN]-())
SET p:Actor

2.12 Refactoring property values into nodes

MATCH (m:Movie)
UNWIND m.languages AS language

MERGE (l:Language {name:language})
MERGE (m)-[:IN_LANGUAGE]->(l)
SET m.languages = null

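This query iterates over every Movie node, creates a Language node for each language it finds, and connects the Movie node to the Language node with an IN_LANGUAGE relationship. Setting m.languages to null then removes the original list property.

Equivalent JavaScript-style pseudocode: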

for (const m of movies) {
  for (const language of m.languages) {
    MERGE (l:Language {name:language})
    MERGE (m)-[:IN_LANGUAGE]->(l)
  }
}
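
After a refactoring like this it is worth checking the result. The sketch below assumes the refactoring query above has been run; it contains two separate queries, and the aliases are illustrative.

```cypher
// Query 1: how many movies are linked to each Language node.
MATCH (m:Movie)-[:IN_LANGUAGE]->(l:Language)
RETURN l.name AS language, count(m) AS numMovies
ORDER BY numMovies DESC;

// Query 2: movies that still carry the old list property (should be 0).
MATCH (m:Movie)
WHERE m.languages IS NOT NULL
RETURN count(m) AS notYetRefactored;
```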